ELECTRICAL ACTIVITY OF THE NERVOUS SYSTEM:
I. APPARATUS, RECORDING TECHNIQUES
AND FIELD OF STUDY*

KAI JENSEN
University of Wisconsin

* This program of research was supported in part by a series of grants
from the Special Research Fund of the University of Wisconsin. The
balance of the funds was supplied by the School of Education.

I. THE PROBLEM

In the realm of education, and human behavior generally, the role of
the central nervous system is admittedly great. The present program of
research utilizes electrical potentials of the brain as indicators of
activity within the central nervous system. It is thought that these
electrical signs of brain activity may have important consequences for
our understanding and evaluation of human behavior.

The present program of research, which has been under way for some
time, involves the study of: the conditions under which “normal”
electrical activity of the brain may be secured; the developmental
aspects of cortical potentials from the longitudinal point of view;
brain waves under conditions of sleep, emotional disturbance, varying
physico-chemical conditions, and auditory, visual, tactile and pain
stimulation; the location of structural abnormalities and pathology of
the brain by the use of potential waves as signs; the origin and nature
of the electrical activity generated in the brains of the abnormal; the
relationship between differential patterns of cortical potentials and
cytoarchitectonic structure; and the origin and nature of the
neurophysiological correlates of complex forms of behavior, such as
problem solving, concept formation, insight, and learning behavior.

II. HISTORICAL BACKGROUND

The important investigations of du Bois-Reymond (21) (1848) clearly
demonstrated some of the electrical properties of living tissue. His
researches shed much light on demarcation currents and action currents
of muscle and nerve, but it was not until 1875 that Caton (17) reported
experiments in which he told of finding distinct fluctuations of
currents within the brains of living animals. While attempting to
investigate localization points on the brains of rabbits and monkeys,
he noted distinct electrical activity within the brain itself. By
placing non-polarizable electrodes upon the surface of the cerebral
cortex and the skull and by conducting the resulting current through a
sensitive galvanometer, he found galvanometer deflections which varied
with the animal’s psychic function and physiological state, increasing
with visual and other forms of stimulation and disappearing at death.

Fleischl von Marxow (27) (1890)* carried Caton’s work further by
showing that through peripheral stimulation these deflections were
increased when the electrodes were placed on, or in the vicinity of,
Munk’s visual area. He was perhaps the first to observe these brain
oscillations by conducting the potentials through the skull of the
intact animal. As a result of his observations on the brain potentials
of animals, he predicted the possibility of studying the various
psychic actions of man through the media of electrical potential
changes conducted from the scalp.

* The material presented in this paper had originally been deposited by
Fleischl von Marxow with the Imperial Academy of Vienna as a sealed
manuscript in 1883.

While working on the cerebral cortex of dogs, Beck (7) (1890), by
placing two electrodes on the surface of the cortex, showed the
existence of potential changes which had no apparent relationship with
either the heart beat or respiration, and which were independent of the
animal’s physical movements. Like Caton (17) (1875), he showed that a
strong current oscillation was set up in the occipital lobes if the
eyes were stimulated with a bright light.

In collaboration with Cybulski, Beck (8) (1892) continued his work with
the brains of monkeys and dogs in an attempt to show that the currents
in the cortex are self-existing and are not transmitted currents. As a
result of their work, they concluded that these electrical phenomena of
the cerebral cortex correspond to simple psychic conditions and are not
due to irrelevant physiological functionings of the organism.

Gotch and Horsley (37) (1891), working 
with cats, rabbits, and monkeys, and using 
Lippmann’s capillary electrometer, failed to 
record currents when the animal was at rest 
but found distinct oscillations with peripheral 
stimulation. These were also found by
Danilewsky (18) (1891). 

In 1904, Tchiriev (85), who was working 
with these cortical potentials, reached the 
conclusion that the potential changes were 
dependent upon the movement of the blood 
in the brain and therefore could not repre- 
sent the functional state of the central nervous 
system. 

In 1912, Kaufmann (46), using an improved
electrical system, Wiedemann’s galvanometer, 
was able to disprove Tchiriev’s (85) (1904) 
point of view. With his more sensitive appa- 
ratus he was able to record these regular, 
spontaneous oscillations from the skull of the 
animal. He was able to verify the existence 
of potential variations with peripheral stim- 
ulation. 

Prawdicz-Neminski (66) (1913), using the
new string galvanometer, established the in- 
fluence of peripheral stimulation and verified 
the results of Beck and Cybulski (8) (1892). 

In 1925, Prawdicz-Neminski (67), using
non-polarizable electrodes and the large Edelmann
string galvanometer, attempted, as had
Beck (7) (1890), Danilewsky (18) (1891),
and Kaufmann (46) (1912), to establish the
existence of spontaneous fluctuations of current
in the cerebral cortex. Working with
dogs, he made simultaneous records of the 
“electrocerebrogram”, cerebral pulse, and 
blood pressure, and arrived at the conclusion 
that, contrary to Tchiriev’s (85) (1904) 
view, these fluctuations were not the result 
of friction of the blood on the walls of the 
cerebral vessels, but rather that they were re- 
lated to certain psychical processes since they 
disappeared before complete arrest of cerebral
circulation. Prawdicz-Neminski (67)
(1925) was also able to demonstrate the ex- 
istence of waves of the first and second order, 
appearing 10 to 15 a second and 20 to 32 a
second respectively, even when conductions
were made from the intact skull of the
animal.

The majority of the authors cited above 
believed these cortical potential oscillations to 
be the expression of the activity of the cere- 
bral cortex of the animal, since they varied 
with changes in cortical function and disappeared
if the central nervous system was under
the influence of a narcotic or if the animal
expired. A distinction was also made between 
the regular existing current which could be 
conducted from the cortex while the animal 
was at rest, and the variations in this current 
produced by peripheral stimulation. These 
latter oscillations were particularly sensitive 
and disappeared with the cooling of the cor- 
tex and from inexplicable causes. 

In 1910, Berger (9) (1929), using the small
Edelmann string galvanometer and
non-polarizable electrodes in his work with dogs,
noted regular, minute string oscillations when 
the animal was not under the influence of 
external stimuli, if the electrodes were placed 
in symmetrical positions on the cortex. More- 
over, he was unable to elicit any change in 
these oscillations upon peripheral stimulation.

Some years later, Berger (9) (1929) began 
a series of experiments on dogs, using the 
large Edelmann string galvanometer and a 
modified form of Siemens and Halske’s
double-coil galvanometer. Special precautions
were taken to prevent cooling and evaporation
from the cerebral cortex by inserting zinc
plate electrodes into the subdural cavity and
by filling the trephined points with bone wax.
Berger was then able to show the existence of
regular current oscillations when the electrodes
lay over the right and left hemispheres,
as well as at two points on the same hemisphere.
He was also able to record simultaneously
the “cerebrogram” and the electrocardiogram.
In an attempt to show that these
cerebral oscillations were not due to filling of 
the veins and arteries, and breathing, the 
upper cervical spinal cord was cut in the ex- 
perimental animal under observation. Breath- 
ing ceased, and finally, after a short time, the 
heart beat ceased also. The electroenceph- 
alogram, however, which was conducted from 
both hemispheres of the dog, continued after 
the heart beat had ceased. The brain oscillations
changed only in so far as they became
more regular. On the basis of these experiments
Berger concluded that the cerebral current
oscillations could not be merely mechanical
consequences of cerebral blood movements
or of breathing behavior.

This experiment remains perhaps the nicest
verification of the theory of cortical origin of
brain potentials brought forward up to this
point, and was in complete harmony with the
views held by Beck (7) (1890), Danilewsky
(18) (1891), Kaufmann (46) (1912), and
others.

Berger (9) (1929) was further able to 
verify the observations of Prawdicz-Neminski
(67) (1925) who had demonstrated the presence
of two distinct waves, which he called
first and second order waves, appearing 10 to
15 times a second and 20 to 32 times a second
respectively. According to Berger’s results, 
the amplitude of the current oscillations con- 
ducted from the brain surface of dogs reached 
an average of 200 to 600 microvolts for the 
larger 90-100 millisecond waves and 130
microvolts for the shorter and smaller 40-50 
millisecond waves. 

Having demonstrated the presence of spon- 
taneous electrical activity in the brains of 
dogs and monkeys, Berger began his pioneer 
work which resulted in his demonstration of 
the presence of the electroencephalogram, or 
Berger Rhythm, in the brains of human 
beings. 

In 1924, while working with a 17-year-old
youth who had been trepanned palliatively 
above the left cerebral hemisphere for a sus- 
pected tumor, Berger (9) (1929) (after in- 
serting a high resistance platinum and quartz 
wire, i.e. 5200 and 3200 ohms, into his cir- 
cuit) succeeded in receiving regular oscilla- 
tions of the galvanometer strings when both 
electrodes were in the region of the trepanation
and about 4 cm. apart. This original
discovery was made with the small Edelmann 
string galvanometer and no record was pos- 
sible with this apparatus. He was later able, 
with the aid of a Siemens and Halske double- 
coil galvanometer, to verify the original ob- 
servation. He obtained tracings by introduc- 
ing needle electrodes extra-durally through 
the site of the trephine opening in previously
operated patients, and also got similar curves 
from normal persons using lead foil electrodes 
applied to the scalp. He found it possible to 
record these regular oscillations, which are 
distinguishable by waves of two character- 
istics, having an average duration of 90 and 
35 milliseconds, the amplitude of the large
wave amounting to 150 to 700 microvolts and
that of the 35 millisecond wave to 20 to 30 
microvolts. 
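
The durations and the frequencies quoted for these waves describe the same thing in two ways, since a wave lasting d milliseconds recurs 1000/d times a second. The conversion may be sketched as follows; the function name is an illustrative assumption and forms no part of the original reports.

```python
def duration_ms_to_per_second(duration_ms: float) -> float:
    """Convert the duration of one wave (milliseconds) to waves per second."""
    if duration_ms <= 0:
        raise ValueError("duration must be positive")
    return 1000.0 / duration_ms


# Average durations quoted in the text for Berger's human records.
for label, ms in [("larger wave", 90), ("smaller wave", 35)]:
    rate = duration_ms_to_per_second(ms)
    print(f"{label}: {ms} ms -> {rate:.1f} per second")
```

On this reckoning the 90 millisecond wave corresponds to about 11 a second and the 35 millisecond wave to about 29 a second.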

Berger believed these potential oscillations 
were due to electrical changes within the brain 
of the individual, and held that they were 
independent of other physiological functions. 
The alpha or “Berger” rhythm appears at a 
frequency of about ten a second in normal 
adults, and has been assumed to represent the 
spontaneous electrical function of the resting 
cortex. This rhythm can best be recorded in 
conditions of mental repose when the eyes 
are closed. Sensory (i.e. tactual, auditory, 
and visual) stimulation and mental activity 
tend to diminish or even to abolish the wave. 
The cortical rhythms seem to be profoundly 
affected by certain diseases of organic nature, 
as well as by the effects of various narcotics 
which affect the central nervous system. Dur- 
ing natural sleep the amplitude of the waves 
is diminished. Berger (12) (1932) found the 
frequency of the waves to be partly a func- 
tion of age and also found no well-established 
rhythm in children under one year of age. In 
children of from three to four months he 
found a wave of lower frequency and longer 
duration than in normal adults. This devel- 
opmental trend continued up to the age of 
five years, the upper age limit of his experi- 
mental group, where he found values of 110 
and 120 milliseconds which closely approach 
those of the adult. Lindsley (57) (1936) and 
Smith (72) (1937) also found developmental 
trends in the frequency of the alpha wave 
with an increase up to the age of eight years 
where the adult frequency was found. 

Berger (9) (1929) thought of the electrical 
activity which he studied as emanating from 
the entire cortex, but Adrian and Matthews 
(2) (1934) have held that it originated in the 
occipital lobe of the brain and was closely 
related to visual functions. Still more re- 
cently Jasper and Andrews (43) (1938) have 
presented results tending to confirm Berger’s 
original position. 

There is general agreement as to the fre- 
quency of the alpha waves in adults (10 to 
10.5 per second with an accepted range of 
from 8 to 13 a second), but considerable dis- 
agreement as to the frequency range of the 
beta rhythms. Berger (9) (1929) mentioned 
beta waves with frequencies of from 20 to 25 
per second. A year later (10) (1930) he reported
beta frequencies of from 25 to 33 a
second. In 1932 and 1933 Berger (12) (13)
published a frequency range of from 20 to
50 a second. Gibbs, Davis, and Lennox (34)
(1935) quote Berger as having found a range 
of frequencies from 50 to 60 per second. 
Jasper and Carmichael (44) (1935) obtained 
an average frequency of 25 for the beta waves, 
but also reported beta frequencies ranging 
from 25 to 50 a second. Foerster and Altenburger
(28) (1935) reported beta waves with
an average frequency of 33.3 a second. Davis 
and Davis (19) (1936) obtained a range of 
from 18 to 50 a second with an average fre- 
quency of 25 a second for the beta waves. 
Lemere (54) (1936) found a range of from 
18 to 35 a second. Liberson (56) (1936),
after a survey of the literature, placed the 
average range of the beta waves between 25 
and 40 a second. Jasper and Andrews (43) 
(1938), in a quite recent publication, place 
the frequency range of the beta waves be- 
tween 17 and 30 per second, with an average 
frequency of 25. All experimenters are 
agreed that the magnitude of the beta waves 
is considerably less than that of the alpha 
waves. This means that the difficulty of de- 
tecting and eliminating artifacts is greatly 
increased. 
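
The extent of the disagreement over the beta range can be made concrete by tabulating the ranges quoted above and comparing them. The following is only a sketch: the table holds the figures as read from the text (the upper limit of 50 for Berger (1932, 1933) and the lower limit of 18 for Lemere are readings assumed here), the range that Gibbs, Davis, and Lennox attribute to Berger is left out because it is a secondhand quotation, and the names `envelope` and `common_band` are illustrative.

```python
# Beta-wave frequency ranges (waves per second) as quoted in the text.
# Two entries rest on assumed readings: Berger (1932, 1933) upper limit 50,
# Lemere (1936) lower limit 18.
REPORTED_BETA_RANGES = {
    "Berger (1929)": (20, 25),
    "Berger (1930)": (25, 33),
    "Berger (1932, 1933)": (20, 50),
    "Jasper and Carmichael (1935)": (25, 50),
    "Davis and Davis (1936)": (18, 50),
    "Lemere (1936)": (18, 35),
    "Liberson (1936)": (25, 40),
    "Jasper and Andrews (1938)": (17, 30),
}


def envelope(ranges):
    """Smallest single range that contains every reported range."""
    lows, highs = zip(*ranges)
    return min(lows), max(highs)


def common_band(ranges):
    """Range shared by every report, or None if the reports do not all overlap."""
    lows, highs = zip(*ranges)
    low, high = max(lows), min(highs)
    return (low, high) if low <= high else None


all_ranges = list(REPORTED_BETA_RANGES.values())
print("envelope:", envelope(all_ranges))
print("common band:", common_band(all_ranges))
```

On these figures the reports span 17 to 50 a second in all, and the only frequency lying within every reported range is 25 a second, which is also the average given by several of the authors.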

In a recent publication Berger (15) (1937) 
retains his original position with respect to 
the existence of the beta waves as a separate 
and distinct order of waves, but modifies his 
interpretation somewhat. He now believes 
that beta waves originate somewhere in the 
outer three layers of the cortex and represent 
the psycho-physiological activities of the 
brain. He further believes that the appear- 
ance of these waves during mental activity is 
due to an increase in the amplitude of the 
wave itself and is not merely the result of the 
removal of the alpha rhythm which, when 
present, obscures the beta waves. 


A third phenomenon observed in the elec- 
trical activities recorded from the human 
brain is the appearance of periods of inac- 
tivity in the production of alpha waves. 
These latent periods, which are of the order 
of one second, are less fixed and constant 
and have as yet been little studied. 

After the excellent pioneer experiments of
Berger had been confirmed by the originally
skeptical Adrian and Matthews (2) (1934),
by Jasper and Carmichael (44) (1935), and 
by Gibbs and Davis (33) (1935), interest in
the human electroencephalogram became very
great.

III. PROOF OF BRAIN ORIGIN


Tchiriev (85) (1904), as has already been 
said, believed the potential changes which he 
found in his researches to be dependent upon 
the movements of the blood in the brain. 
Danilewsky (18) (1891), Beck (7) (1890), 
Prawdicz-Neminski (67) (1925), and Berger
(9) (1929), on the other hand, held that these 
oscillations were due, at least partly, to some 
activity within the brain, independent of cere- 
bral blood movements. Since then, numerous 
control experiments have been conducted in 
an endeavor to rule out the possibility that 
these potential oscillations, recorded from the 
surface of the head, are caused by some or- 
ganic process other than brain activity, and 
to demonstrate their cortical origin. One 
must, of course, always be on the alert for 
artifacts, and it is often not easy to distin- 
guish these with precision from brain 
potentials. 

Because of their form and frequency, the 
potentials of the electroencephalogram can be 
distinguished from muscle action currents. 
Adrian and Matthews (2) (1934) have shown 
that if the electroencephalogram were due to 
a clonus or tremor of the orbital musculature, 
then active and passive movements of the eye 
ball should give corresponding waves. They 
argued against the orbital origin of these po- 
tential changes by pointing out that the ex- 
ternal eye muscles are deeply buried in the 
orbit and that their action currents could have 
but little effect on electrodes placed on the 
scalp. They presented experimental evidence 
to show that neither active movements of the 
eyeball, produced by looking at the spokes 
of a revolving wheel, nor passive movements 
produced by the oscillations of the eyeball, 
when in contact with a clockwork driven rod, 
give a corresponding potential wave. They 
further presented experimental evidence to 
show that movements of the head and neck, 
and wrinkling of the forehead and scalp, pro- 
duce electromyograms, but do not alter the 
electroencephalogram in any way. 

If the potentials of the electroencephalo- 
gram were due to clonus or tremor of the 
orbital musculature, then the potential gra- 
dient would be at a maximum in the neigh- 
borhood of the eye; a conclusion which is not 
verified by the data. (Adrian and Matthews 
(2) (1934)) If the potentials were of muscle
origin, they should be greatest on the skin
surface, but Berger (9) (1929), (12) (1932);
Adrian and Matthews (2) (1934); Jasper and
Carmichael (44) (1935); and Jasper and
Andrews (42) (1936) have shown that the
amplitude of the oscillations increases when
the electrodes are placed directly in contact
with the cortex, on the periosteum of the
skull, or over a trephine opening in the skull.

The potentials due to scalp movements, eye
movements, eye blinks, head movements, and
arrectores pilorum contractions were carefully 
ruled out by Berger (12) (1932), and have 
been shown to be clearly distinguishable from 
the regular alpha and beta waves, in fre- 
quency as well as in form and amplitude, by 
Jasper and Andrews (42) (1936). 

Adrian and Matthews (2) (1934) have fur- 
ther shown that the wave could not be due 
to retinal potentials since electrodes placed 
on the scalp could not pick up even the poten- 
tial changes caused by illuminating the eye 
suddenly with a bright light. 

Simultaneous electrocardiograms and elec- 
troencephalograms show no apparent rela- 
tionship between heartbeat and alpha and 
beta waves. Berger (9) (1929), (12) (1932) 
was able to show that even a momentary ar- 
rest of the heartbeat did not produce appre- 
ciable effect on the electroencephalogram of a 
dog, and that the brain rhythm may continue 
even after the heart has ceased to beat. 

Respiration curves are not related to the 
electroencephalogram as has been pointed out 
by Berger (9) (1929) and Lindsley and Ru- 
benstein (58) (1937). Simultaneous record- 
ing of the cerebral plethysmogram and the 
electroencephalogram by Berger (11) (1931) 
also showed that there was no relationship 
between brain pulse or volume change and 
the electroencephalogram. 

According to Berger (11) (1931), the electroencephalogram
during normal sleep appears
to be diminished in amplitude with no
apparent change in frequency, although 
Loomis, Harvey, and Hobart (59) (60) 
(1935) have shown that the changes in the 
electroencephalogram during sleep are more 
complex than Berger held. 

The alpha wave of the human electroen- 
cephalogram is augmented in amplitude after 
a cocaine injection and decreases after a large 
dose of scopolamine. In like manner, it de- 
creases in deep general anesthesia, just as it 
increases in height during excitation periods. 
(Berger (11) (1931)) The effect of the barbiturates
is quite different, giving an apparent
increase in magnitude, and a grouping of
alpha waves with some frequency changes. 
(Berger (9) (1929), Adrian and Matthews 
(2) (1934)) 

An altered electroencephalogram is recorded
when the electrodes are placed over
abnormal brain tissue if the cortex is affected.
(Foerster and Altenburger (28) (1935)) 
Under the influence of pressure on the brain 
from tumors, cerebri hydrocephalus, and
intra-cranial bleeding, the electroencephalo- 
gram undergoes a change in which the alpha 
waves are lengthened. (Berger (11) (1931), 
Walter (86) (1936)) 

In the unconsciousness of the epileptic fit, 
and in deep narcosis, the alpha waves are 
markedly modified or may be entirely lacking. 
(Berger (11) (1931), (12) (1932); Gibbs, 
Davis, and Lennox (34) (1935); Gibbs, Len- 
nox, and Gibbs (35) (1936), (36) (1936)) 

As another bit of evidence to establish the 
brain origin of the observed electrical activ- 
ity, Berger (11) (1931) has shown that the 
characteristic electroencephalogram can be 
traced from the cortex of the brain but not 
from the brain stem. Recently, however, 
Spiegel (74) (1937) has been able to record 
from the nuclei of the thalamus
curves which correspond closely to the alpha
and beta waves found by Berger in his elec- 
troencephalograms. 

It has also been experimentally shown that 
concentration of attention, as in solving arith- 
metical problems, tends to diminish or to 
abolish the alpha waves of the electroencephalogram.
These researches by Berger (11)
(1931), (15) (1937); Adrian and Matthews 
(2) (1934); Foerster and Altenburger (28)
(1935); and Rohracher (69) (1935) tend to 
show the close relationship between the elec- 
trical activity of the cortex and psychic 
function. 

Various forms of sensory stimulation, ac- 
cording to Berger (11) (1931), Jasper and 
Carmichael (44) (1935), and Travis and 
Gottlober (80) (1936), may tend to diminish 
or abolish the alpha waves, after a latency 
period of from 0.2 to 0.4 seconds. Loomis, 
Harvey, and Hobart (59) (60) (1935) have 
shown that sensory stimulation of the sleep- 
ing subject may also cause bursts of alpha 
potentials without awakening him. 

Dusser de Barenne and McCulloch (24) 
(1936) have produced evidence of a different 
nature to show that the electroencephalogram
is of cortical origin. Working with the exposed
cortex of animals, they found that
thermo-coagulation at 80° C. for 5 seconds
killed the entire cortical thickness, and immediately
and completely abolished all characteristic
action potentials.

JOURNAL OF EXPERIMENTAL EDUCATION


IV. RECORDING TECHNIQUES


A. Apparatus 

The electrical potentials of cerebral origin 
recorded from the surface of the skull vary 
from a few microvolts to about 1000 micro- 
volts. Therefore, any galvanometer or other 
apparatus designed to register these cortical 
potentials must be extremely sensitive to be 
able to pick up and record the minute poten- 
tial variations. Not only must the apparatus 
be extremely sensitive, but it should be 
capable of faithful reproduction of the form 
of the potential variations involved. Since 
the duration of these potentials varies within 
very wide limits, the apparatus should be 
equally sensitive to potential variations from 
one to at least one hundred per second. The 
frequencies of the waves characteristic of the 
human electroencephalogram vary from 2 to 
13 a second for the alpha (taking into con- 
sideration young children, special experimen- 
tal conditions, and certain pathological cases), 
and from 17 to 50 a second for the beta wave. 
Consequently, unless one wishes to deliber- 
ately exclude all waves above and below a 
specific limit, as may be the case under cer- 
tain circumstances, the amplification and re- 
cording system must have a sensitivity suffi- 
ciently large to cover this frequency range. 
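
The requirement just stated can be expressed as a simple check of whether the range of frequencies an apparatus will follow contains the alpha and beta ranges. The sketch below uses only the limits given in this section (2 to 13 a second for alpha, 17 to 50 for beta, and a recommended response from 1 to at least 100 a second); the function names and the label "unclassified" for frequencies between the two ranges are assumptions made for illustration.

```python
# Frequency limits (waves per second) taken from the section above.
ALPHA_BAND = (2.0, 13.0)
BETA_BAND = (17.0, 50.0)
RECOMMENDED_PASSBAND = (1.0, 100.0)


def classify_wave(per_second: float) -> str:
    """Name the band a frequency falls in, using the limits given in the text."""
    if ALPHA_BAND[0] <= per_second <= ALPHA_BAND[1]:
        return "alpha"
    if BETA_BAND[0] <= per_second <= BETA_BAND[1]:
        return "beta"
    return "unclassified"  # hypothetical label; the text names no band here


def passband_covers(passband, band) -> bool:
    """True if the apparatus passband contains the whole of the given band."""
    return passband[0] <= band[0] and band[1] <= passband[1]


print(classify_wave(10.0), classify_wave(25.0), classify_wave(15.0))
print(passband_covers(RECOMMENDED_PASSBAND, BETA_BAND))
# A hypothetical recorder that follows only 1 to 40 a second misses part of beta.
print(passband_covers((1.0, 40.0), BETA_BAND))
```

A recorder whose upper limit falls below 50 a second would therefore register the alpha waves faithfully while losing the faster beta waves.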

The earlier researches of Berger (9) (1929) 
were conducted with the Edelmann String 
Galvanometer, and later, with a Siemens and 
Halske Double Coil Galvanometer. 

With the introduction of the use of the 
electron-tube amplifier for the study of brain 
rhythms, an extremely sensitive and useful 
tool was made available for the amplification 
of these minute potential oscillations. Since 
then amplifiers, suitable for the magnification 
of these brain potentials up to a point where 
even quite minute potential variations can be 
studied, have been designed and built. Con- 
denser-coupled amplifiers are generally used, 
but for some special purposes a direct-coupled 
amplifier has been found advantageous. For 
simultaneous recording from different brain 
areas a balanced input amplifier is necessary. 
Among the amplifier systems which have been
proposed are those of Scheminsky (70)
(1928); Matthews (63) (1928); Bartley and
Newman (6) (1930); Fessard (26) (1932);
Matthews (64) (1934); Garceau and Davis 
(30)(1934); Spiegel (73) (1934); Jasper and 
Andrews (42) (1936); Koopman and Hoe- 
landt (47) (1936); and Huddleston, White- 
head, and Moritz (40) (1936). 

These amplifiers have been used in con- 
junction with oscillograph systems differing 
from one another in various ways, particu- 
larly as to the method of recording and as to 
their ability to follow various frequencies. 
The Matthews optical type oscillograph has 
been used by Adrian and Matthews (2)(3) 
(1934), Wang (87) (1934), Adrian and 
Yamagiwa (4) (1935), and Lemere (54)
(1936). Kreezer (52) (1936); Jasper and 
Andrews (42) (1936); Travis and Gottlober 
(81) (1937); Travis, Knott and Griffith (84) 
(1937); and Gottlober (38) (1938) have all
used the Westinghouse Oscillograph. 

In many ways the cathode ray oscillograph 
is the ideal recording instrument since the 
stream of electrons which it uses has neither 
appreciable mass nor damping, and it permits 
faithful registration of frequencies up to at 
least 1,000,000 cycles per second. Gasser 
and Erlanger (32) (1922) pioneered the use 
of this instrument in physiological work, but 
only recently has it come into use for the 
study of the electrical activity of the cerebral 
cortex. (Dusser de Barenne and McCulloch 
(24)(1936), Walter (86)(1936), and Blake 
(16) (1937)) Among those who have de- 
scribed setups for utilizing this potent instru- 
ment are Schmitz (71) (1933); Garceau and 
Davis (30) (1934); Koopman and Hoelandt
(47) (1936); McCulloch and Wendt (62) 
(1936); Huddleston, Whitehead, and Moritz 
(40) (1936); and Gans (29) (1937). 

Partly to overcome the expense of photo- 
graphically recording the oscillations of the 
moving spot on the cathode ray tube, various 
ink-writing oscillographs have been developed. 
(Toennies (76) (1932), (77) (1933); Adrian 
and Matthews (2) (1934); Garceau and Davis 
(31) (1935); Loomis, Harvey and Hobart 
(59) (1935); and Offner and Gerard (65)
(1936)) These oscillographs have a maxi- 
mum frequency of about 40 to 60 a second, 
but have an advantage in that they are rela- 
tively inexpensive to operate and can be read 
instantaneously. 

Some other experimenters have employed 
loud-speakers as an added means of following
the potential oscillations of the electroencephalogram.
(Adrian and Matthews (2)
(1934); Garceau and Davis (30) (1934);
Huddleston, Whitehead, and Moritz (40)
(1936))


B. Electrodes

Several kinds of electrodes have been employed
for the conduction of electrical potentials
from the cortex and the surface of the
skull or scalp by workers in this field.
Berger (9) (1929) used fresh amalgamated
plates about 12 mm. by 4 mm., the four
corners of which were rounded off in order to
prevent injury. A well-insulated wire was 
soldered to the plates, which were inserted 
through a slit in the dura next to the surface 
of the cortex of the experimental animal. In 
subsequent work with human beings, Berger 
(9) (1929) used steel needle electrodes which
were zincified and insulated up to the point 
with a coat of lacquer. Funnel electrodes 
were also used by Berger (9) (1929), but, due 
to the danger involved in the use of the zinc 
sulfate solution, they were employed only for 
standard records for purposes of comparison. 
Metal electrodes of round copper, platinum,
or silver plates wrapped in a somewhat
larger piece of flannel soaked in a twenty percent
sodium chloride solution, were employed
by Berger (9) (1929). Jasper and Carmichael
(44) (1935) and Lemere (54) (1936) used 
silver discs, 1 to 2 cm. in diameter, covered 
with flannel soaked in sodium chloride solu- 
tion. Adrian and Matthews (2)(1934) em- 
ployed electrodes analogous to the plate elec- 
trodes of Berger (9) (1929) but of smaller 
diameter. Lead foil electrodes placed be- 
tween two layers of flannel soaked in 20% 
sodium chloride solution were employed by 
Berger (9)(1929). These metal electrodes 
were generally applied to the clean and hair- 
less scalp, and were held in place by thin 
rubber ribbons, which also prevented drying 
of the flannel pads during the course of the 
experiment. Davis and Davis (19)(1936) 
held their plate electrodes in place with San- 
born’s Electrode paste and collodion. 
Non-polarizable silver-silver chloride
needle electrodes, well-insulated up to the 
point by a coat of lacquer, have been used by 
Berger (10)(1930); Dusser de Barenne and 
McCulloch (24) (1936); Jasper and Car- 
michael (44)(1935); Travis and Knott (82) 
(1936); and Travis, Knott, and Griffith (84) 
(1937). The silver-silver chloride wire was
prepared according to the method of Stadie,
O’Brien, and Laug (75) (1931). 

Concentric needle electrodes, made by pass- 
ing a small wire down the shaft of a hypo- 
dermic needle, have been used by Wang (87) 
(1934); Wang and Lu (88) (1936); Gibbs,
Lennox, and Gibbs (35) (36)(1936); and by 
Adrian and Bronk (1)(1929). McCulloch 
and Dusser de Barenne (61)(1936) used a 
concentric silver-silver chloride agar electrode 
of 4 mm. internal and 6 mm. external 
diameter. 

Walter (86)(1936) used silver-silver chlo- 
ride pad electrodes held in place by a special 
cap and moistened with salt solution. For 
work with the exposed brain he used sterilized 
silver-silver chloride wick electrodes filled
with sterile normal saline. 

In working with epileptic subjects, Gibbs, 
Davis, and Lennox (34)(1935) found a dif- 
fused crown electrode best, since it eliminated 
the possibility of injury from needle elec- 
trodes as a result of violent movements on 
the part of the subjects. 

Kreezer (52)(1936) employed a small coil 
of silver wire attached to a piece of rubber 
sponge which was held on the head by means 
of an elastic band. Contact with the skin 
was made with saline electrode paste. 

Dietsch (20) (1932) believed silver-silver 
chloride needles subject to polarization and 
employed silver plates, about 5 mm. in 
diameter, covered with spongy platinum. 

Hoagland, Rubin and Cameron (39) 
(1937), in their work with schizophrenic 
cases, used electrodes made from small lead 
pellets, about 2—3 mm. in diameter, cemented 
to the scalp with collodion, and making con- 
tact with the scalp through a salt electrode 
paste. 

Adrian and Yamagiwa (4)(1935) have de-
veloped electrodes consisting of a small coil
of silver wire, coated with silver chloride con- 
tained in a small glass tube filled with gelatin 
jelly made up with saline and plugged with 
a bit of absorbent cotton. The glass tube 
was held in a rectangular slab of rubber which 
was bandaged to the surface of the head. 

Jasper and Andrews (42) (1936) used elec- 
trodes similar to those of Adrian and 
Yamagiwa (4) (1935), but introduced the 
silver-silver chloride wire into a glass T-tube 
filled with 10% sodium chloride solution, the 
open end of which was stopped with a bit of 
absorbent cotton. The closed end of the 
T-tube was passed through a sponge rubber
block, and was held in place on the head by 
an elastic band. Contact with the scalp was 
made with the moist cotton of the stopped-up 
end of the T-tube. 


More recently, Jasper and Andrews (43) 
(1938) have abandoned this type of electrode 
in favor of a simpler kind. The new elec- 
trode consists of a small, hat-shaped object 
made of chlorided silver with felt-covered
brims. These electrodes have an inside
diameter of 5 mm. and are fixed to the head 
with collodion and electrode jelly. In gen- 
eral, different types of electrodes have been 
employed by the various workers with more 
or less success in specific cases, while working 
with different subjects, and using various 
amplifier systems. 

Special problems, such as localization on 
the exposed cortex, work on young children, 
or on violent epileptic subjects, obviously re- 
quire special electrodes. For general pur- 
poses, Jasper and Andrews (42) (1936) list 
the following as desirable characteristics of 
an electrode: First, the resistance of the
electrode and the skin contact should be as
low as possible. Second, contact with the skin
should remain constant throughout the course
of the experiment, and no potential disturb-
ances should arise from this contact. Third,
the electrodes should also permit convenient 
and efficient attachment to any point on the 
head surface, so that they will not be dis- 
turbed by head movements. Finally, they 
should be comfortable for the individual to 
whom they are attached. 


Jasper and Andrews (42)(1936) seem to 
favor surface electrodes in preference to needle 
electrodes inserted through the skin to the 
periosteum, because they are more convenient 
and comfortable, and because no anesthesia or 
asepsis is necessary. Their records, taken from 
a pair of surface electrodes simultaneously 
with records from needle electrodes, show no 
qualitative differences between the two meth- 
ods of recording except that in some records 
from the needle electrodes the contact arti- 
facts have a greater amplitude. If the elec- 
trodes are brought together up to a distance 
of 2 cm. from each other, the needle electrodes 
may pick up from 25% to 30% more poten- 
tial than the surface electrodes directly above 
them. This difference is much less if the 
electrodes are farther apart. 

Berger (12) (1932), Kornmueller (40)
(1933), and Jasper and Carmichael (44)
(1935) have also shown that simultaneous
records taken from a pair of needle electrodes,
inserted through the scalp to the periosteum, 
and a pair of surface electrodes on the scalp 
directly above, are practically identical in 
form, although the potentials picked up by 
the needle electrodes are generally slightly 
larger. 

Greater stability and freedom from contact 
artifacts, such as arise from cut tissue, are 
obtained from surface electrodes. Jasper and 
Andrews (42)(1936) further found that the 
resistance between needle electrodes inserted 
through the skin is usually of the same order 
of magnitude as that of the surface electrodes. 


C. Placement of Electrodes 

Two distinct methods of electrode place- 
ment, the bipolar and the monopolar, have 
been used in the study of the electrical activ- 
ity of the cortex. The pioneer researches of 
Berger (9)(1929), (10)(1930), (11) (1931), 
dealing with human subjects, were conducted 
upon trephined individuals, and the needle 
electrodes were inserted through the scalp 
within the region of the site of the trephine
openings at points at least 15 mm. apart. 
Berger believed that the entire cortex was 
equally active, and hence both electrodes had 
to be active as applied to the source of the 
potential activity. Later Berger (14) (1935), 
in recording transcortical potentials from nor- 
mal human subjects, placed the electrodes at 
opposite extremities of the skull. Most fre- 
quently he placed one electrode at the level 
of the frontal bone and the other on the con- 
tralateral occiput, as did also Lemere (54) 
(1936). Jasper and Carmichael (44) (1935) 
and Jasper and Andrews (42)(1936), (43) 
(1938) used ipsilateral and contralateral
placements. 

After the work of Adrian and Matthews 
(2)(1934), the electrodes were placed most 
frequently a short distance apart at any 
determined level of the skull. 

Loomis, Harvey, and Hobart (59) (1935) 
placed their electrodes on the high forehead 
and crown of the head. In later recordings 
from double amplifiers, they placed electrodes 
in the midline on the forehead, the crown, 
and the occiput, the amplifiers being con- 
nected between forehead and crown electrodes 
and between crown and occiput electrodes. 
Kreezer (52) (1936), using a similar double
amplifier arrangement, placed his electrodes
... inch to the right of the median plane,
on the right occipital area, right motor area,
and the anterior part of the frontal area.


The unipolar method owes its origin, in
part, to the work of Adrian and Matthews
(2) (1934), who were led to the conclusion
that the alpha wave originated in the occipital
lobes, even though it could be picked up from
across various parts of the head. Their scheme
indicates a single source of oscillating poten- 
tials located in the occipital region. They 
concluded that the position of one of the 
electrodes was of little importance (they 
called it an “indifferent” electrode), as long
as the active electrode was placed on the 
occipital portion of the head. 


Kornmueller (48) (49) (1933), (50) (51) 
(1935); Toennies (76) (1932), (77) (78) 
(1933), (79)(1935); Dusser de Barenne and 
McCulloch (23) (1936); and Foerster and 
Altenburger (28) (1935) have used this prin- 
ciple of the “indifferent” electrode in their 
work on animals. These experimenters ap- 
plied the active electrode to various points on 
the skull or cortex and placed the “indiffer-
ent” electrode on the eye or ear of the animal.


Davis and Davis (19) (1936); Adrian and 
Matthews (2) (1934); Travis and Gottlober 
(80)(1936), (81)(1937); Travis and Knott 
(82)(1936); Durup and Fessard (22) (1936); 
and Gibbs, Lennox, and Gibbs (35)(36) 
(1936) all used monopolar electrode place- 
ment in their experiments with human beings 
and also spoke of “active” and “inactive” 
electrodes. The inactive, indifferent, or 
ground electrode is most often placed on the 
ear lobe or neck of the subject, and the poten- 
tial activity is assumed to originate in the 
immediate vicinity of the active electrode. 


Davis and Davis (19)(1936) placed their 
active electrode on the top of the head at a 
point just above the occipital protuberance 
on the midline, a position corresponding 
roughly to the motor and visual cortex, the 
reference electrode being on the left ear of 
the subject. 


Travis and Gottlober (80) (1936), (81) 
(1937) placed their active electrode over the 
left occipital area, as did Travis and Knott 
(83)(1937) and Durup and Fessard (22) 
(1936). Travis and Gottlober (80) (1936) 
chose the right motor area. All, however, 
used the lobe of the left ear as the location 
for their reference, or “indifferent”, electrode.
Bartley and Bishop (5) (1933) reached the 
conclusion that no truly “indifferent” lead 
was possible. Jasper and Andrews (42) 
(1936) also criticized the notion of “indif- 
ferent” or ground electrodes. They held that 
when electrodes are applied to the body the 
potential changes led off are always due to 
the sum total of the e.m.f. producing activities 
occurring between the electrodes, and also 
that unless one assumes that the entire brain, 
except the occipital lobe, is electrically dead, 
it seems improbable that a truly “indifferent” 
electrode can be placed on the head. 
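
The point made by Jasper and Andrews can be illustrated with a small numerical sketch. The electrode names and microvolt values below are hypothetical and chosen only for illustration; they are not taken from any of the studies cited. The sketch treats every lead as the difference between the potentials at its two electrodes, so the so-called indifferent electrode contributes its own potential to each monopolar record, and a bipolar lead equals the difference of two monopolar leads that share a reference.

```python
# Hypothetical instantaneous potentials in microvolts, relative to some
# distant point. These values are invented for illustration only.
POTENTIALS_UV = {"occipital": 50.0, "motor": 20.0, "ear_lobe": 5.0}


def lead(active, reference, potentials=POTENTIALS_UV):
    """Return the voltage an amplifier sees between two electrodes."""
    return potentials[active] - potentials[reference]


# "Monopolar" leads: each scalp electrode against the ear lobe.
mono_occipital = lead("occipital", "ear_lobe")
mono_motor = lead("motor", "ear_lobe")

# Bipolar lead: the two scalp electrodes against each other.
bipolar = lead("occipital", "motor")

# The ear-lobe potential enters every monopolar record; it cancels only
# when two monopolar records sharing that reference are subtracted.
assert abs(bipolar - (mono_occipital - mono_motor)) < 1e-9

print(mono_occipital, mono_motor, bipolar)
```

If the ear-lobe value is changed, both monopolar figures change while the bipolar figure does not, which is one way of seeing why a reference electrode on the head is not truly indifferent.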

In attempting to localize human cortical 
potentials through the skull, Jasper and 
Andrews (42) (1936) found bipolar leads 10 
to 20 mm. apart to be somewhat better than 
monopolar leads. Rheinberger and Jasper 
(68) (1937) reported better differentiations 
of simultaneous electroencephalograms from 
different brain areas by the bipolar method 
of recording. In a still later publication, 
Jasper and Andrews (43)(1938) used the 
diffused lead taken from the ear lobe as a 
check, but maintained that, except under spe- 
cial conditions, the diffused lead technique 
did not give as good localizations. These 
same authors have also shown that it is pos- 
sible to work out standard placements for 
bipolar electrodes which will permit a high 
degree of differentiation between the various 
regions of the brain. 

After a general review of the subject Jasper 
(41) (1937) concluded that: “Both the mono- 
polar and bipolar methods have their advan- 
tages and disadvantages. The selection of 
one method or the other should be determined 
by the purpose of the particular experiment” (p. 420).


V. DESCRIPTION OF THE APPARATUS AND 
RECORDING TECHNIQUES USED IN THE 
CHILD DEVELOPMENT LABORATORY OF THE 
UNIVERSITY OF WISCONSIN 


Figure 1 presents the floor plan of the Wis- 
consin laboratory, while figure 2 shows, in a 
schematic fashion, the interrelationships of 
the various items of apparatus. Figure 3 
shows a corner of the shielded room in which 
our main low-frequency amplifiers are located. 
This room is 9½ feet long, 7½ feet wide, and
7½ feet high. The walls and ceiling are lined
with fine mesh copper screen, the floor is 
covered with solid copper sheeting, all joints 
of which are soldered together, and the entire
setup is grounded at one point. All items
of equipment in this room are battery oper-
ated and communication with the outside is
effected by means of a pneumatic switching
arrangement.


The main low-frequency amplifiers* employed
in this laboratory (figures 3, 4, 5, 6, and 7)
are three-stage push-pull high-gain amplifiers,
all tubes pentodes, and battery powered, which
make possible a voltage amplification in excess
of twenty million, and are capable of operating
at a noise level of slightly less than 2
microvolts in the input. With preparations of
an ordinary value of resistance, one, and even
half, microvolt signals can be distinguished.
The output is suitable for operation of: the
cathode ray tubes, the single stage consisting
of a single 6A6 in push-pull connection, A.C.
or battery powered, and the power amplifier,
which in turn feeds the dynamic speaker and
the ink recorders.


In the design and construction of these 
amplifiers great pains were taken to assure 
adequate response to frequencies between 1 
and 100 cycles per second in order to insure 
maximum usefulness in the study of the elec- 
troencephalogram. The amplifiers will, how- 
ever, pass signals with somewhat less gain up 
to 10,000 cycles per second. 
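
To give a sense of scale for the figures quoted above, the following short sketch converts a voltage amplification of twenty million into decibels and into output volts for inputs of a few microvolts. It assumes the full amplification is applied with no attenuation anywhere in the chain, which is a simplification; the function names are illustrative only.

```python
import math

# Lower bound quoted in the text: "in excess of twenty million".
VOLTAGE_GAIN = 20e6


def gain_in_decibels(voltage_gain):
    """Express a voltage ratio in decibels (20 * log10 of the ratio)."""
    return 20.0 * math.log10(voltage_gain)


def output_volts(input_microvolts, voltage_gain=VOLTAGE_GAIN):
    """Output voltage for a given input in microvolts at full gain."""
    return input_microvolts * 1e-6 * voltage_gain


print(f"gain: {gain_in_decibels(VOLTAGE_GAIN):.1f} dB")
for microvolts in (0.5, 1.0, 2.0):
    print(f"{microvolts} microvolt in -> {output_volts(microvolts):.0f} volts out")
```

On these assumptions a 2 microvolt input corresponds to about 40 volts at the output, and a half microvolt signal to about 10 volts.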


The relative impossibility of properly bal- 
ancing push-pull stages with commercially 
available components led to the use of a 
single tube in the second stage with the plate 
of the lower input push-pull tube coupled to 
ground. The fact that one-half of the signal 
is thrown away is more than compensated for 
by the diminution of difficulties which would 
otherwise be encountered. The purpose of 
the adjustable condenser coupling by means 
of the tap switch between the first and second 
stage is to eliminate or control 60 cycle 
interference and low-frequency oscillation 
if either causes trouble. With the tap switch
set to smaller condensers, the time constant is
lowered and the circuit is less sensitive to low
frequencies. The use of five-inch cathode ray
tubes in the recording unit requires consid-
erable plate voltage which necessitates push-
pull in the last stage. This is done by a
simple and standard phase inversion scheme
evident from the diagram (figure 6). The
particular scheme of phase inversion used is
not linear at higher frequencies, but is quite
satisfactory over the range covered in the
present experiments.

* The original amplifiers used in this research were designed
and built by Mr. Lovett Garceau, of the Electro-Medical Lab-
oratory, Inc. of Holliston, Massachusetts. The circuit was
dated January 30, 1936. Since then Mr. Edwin Bernet has
furnished valuable technical aid, and has made changes in
design and construction to keep the amplifiers up to date.
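
The effect of switching to smaller coupling condensers can be sketched with the ordinary resistance-capacitance relations. The grid resistance and the three condenser values below are assumptions chosen for illustration; the actual component values of the Wisconsin amplifiers are not given in the text. The sketch treats the interstage coupling as a single R.C. high-pass section with time constant RC and a half-power frequency of 1/(2πRC).

```python
import math


def time_constant_seconds(resistance_ohms, capacitance_farads):
    """Time constant of a single R.C. coupling section."""
    return resistance_ohms * capacitance_farads


def cutoff_cycles(resistance_ohms, capacitance_farads):
    """Half-power frequency, in cycles per second, of the same section."""
    return 1.0 / (2.0 * math.pi * resistance_ohms * capacitance_farads)


def relative_response(frequency, cutoff):
    """Fraction of the input amplitude passed by a one-section high-pass."""
    ratio = frequency / cutoff
    return ratio / math.sqrt(1.0 + ratio * ratio)


GRID_RESISTANCE_OHMS = 1.0e6                          # assumed value
TAP_CAPACITANCES_FARADS = (1.0e-6, 0.1e-6, 0.01e-6)   # assumed values

for capacitance in TAP_CAPACITANCES_FARADS:
    tau = time_constant_seconds(GRID_RESISTANCE_OHMS, capacitance)
    cutoff = cutoff_cycles(GRID_RESISTANCE_OHMS, capacitance)
    at_one_cycle = relative_response(1.0, cutoff)
    print(f"C = {capacitance * 1e6:.2f} mfd: time constant {tau:.2f} s, "
          f"cutoff {cutoff:.2f} cycles, response at 1 cycle {at_one_cycle:.2f}")
```

With these assumed values each tenfold reduction in capacitance shortens the time constant tenfold and raises the cutoff tenfold, so that slow drifts and low-frequency oscillation are passed less readily.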


An adjustable potentiometer makes it pos- 
sible to set the amplitude of the phase invert- 
ing circuit so that the signals in each half of 
the third stage are balanced. A special cal- 
ibrating unit makes it possible to check on 
the performance of the amplifiers at all times. 
As part of the calibrating equipment a Gen-
eral Radio Type 377-B Low-Frequency Oscil-
lator, with a frequency range of from 10 to
70,000 cycles, is used. This unit also fur-
nishes time lines in connection with the
photographic recorders. 


The amplifiers used in this research incor- 
porate such conveniences as very simple 
switching mechanisms supplied solely by the 
automatic action of the plugs in the jacks, 
permitting the use of either grounded or bal-
anced input circuits. The balanced input cir-
cuit used is a recent development. Its pur-
pose is two-fold: (1) to allow simultaneous
operation of several amplifier channels on a 
single subject, and (2) to permit interference- 
free operation when the subject is not shielded 
from induction caused by power lines. (This 
is indispensable in field work). 


Workers in this field have had considerable 
trouble with input circuits, i.e. from subject 
to first stage, due to the large possible pick-up 
of the subject’s body, contact potentials and 
variations at the electrodes on the subject, 
long leads to the grids of the amplifiers, and 
the general difficulty of finding the proper 
place to ground when such very low voltages 
are to be amplified with low input resistance. 
The balanced input arrangement used in the 
low-frequency amplifiers, with the whole 
input 9 megs up from ground, is a convenient 
way of avoiding these difficulties. The input 
network simply floats about on top of the 9
megs with slight D.C. changes and only the 
A.C. is amplified. 

FIGURE 2
GENERAL SCHEMATIC DIAGRAM SHOWING THE
INTERRELATIONSHIPS OF THE EQUIPMENT

NOTE: All wiring between units shielded. Common ground near input.

JOURNAL OF EXPERIMENTAL EDUCATION [Vol. 6, N 








FIGURE 3
CORNER OF SHIELDED ROOM

This photograph shows the two matched battery oper-
ated low-frequency amplifiers and the General Radio Type
377-B Low-Frequency Oscillator used for calibration
purposes.

FIGURE 4
FRONT VIEW OF AMPLIFIERS 
The two upper units on the left are the matched low- 
frequency amplifiers which are completely battery operated. 
The lower unit on the left is the extra stage which is 
either A.C. or D.C. operated. 


The matched power amplifiers are shown on the right. 
(Only one dynamic speaker is shown.) 

FIGURE 5
BACK VIEW OF AMPLIFIERS

The matched power amplifiers are on the left. 

The matched low-frequency amplifiers are on the right. 

In the lower right-hand corner the extra stage which is 
A.C. or D.C. operated is shown. This amplifier appears 
with its shield in place. All other shields are removed to 
reveal details of construction. 

FIGURE 6
WIRING DIAGRAM OF MAIN LOW-FREQUENCY
AMPLIFIERS
All tubes are especially selected type 6C6.

FIGURE 7
WIRING DIAGRAM OF SINGLE STAGE PUSH- 
PULL AMPLIFIERS 


The output here is suitable for operation of the cathode 
ray oscillographs and the power amplifiers. 

Most of this research is done without the
use of electrical filters,* but for certain spe-
cial purposes three different filters (figure 8)
have been built. The R.C. low-pass filter
uses shunt capacitors separated by a series
resistance. The capacitors are specially se-
lected so that their impedance is approxi-
mately equal to the impedance of the output.
The series resistors are of the same order of
magnitude as the output impedance. This
filter can be adjusted so that it passes up to
12 or 15 cycles without serious attenuation,
and rapidly eliminates frequencies above this.
The second filter is an anti-resonant 40-cycle
section consisting of a special inductor and a
General Radio 219-N two dial Decade Con-
denser. The 200-cycle, high-pass filter works
in and out of ... ohms, and is terminated
on the output end in 4 ohms. It con-
sists of a special General Radio Type 830-D
200-cycle High-Pass filter. The use of each
of these filters requires a change to 10 mfd.
coupling condensers in the output network
of the amplifiers. This and other needed
changes are effected by a double-throw
double-pole switch on the front panel.

The electrodes used in the Wisconsin lab-
oratory are of two types. One type consists
of a silver-silver chloride wire inserted into a
glass T-tube filled with 10% saline solution,
making contact with the skin by means of a
cotton plug, and held in place by an elastic
band. The other consists of a small silver
spiral making contact with the skin through
electrode paste, and held in place with
collodion.

Figures 4, 5, 9, and 11 show the power
amplifiers used to operate the loud-speakers
and the ink recorders.

The amplifier system used in this program
of research records the electrical activity of
the nervous system by means of: (1) ink-
writing recorders, and (2) cathode ray oscil-
lographs. Loud-speakers may also be con-
nected to the amplifier outputs for aural
checkup.

The power amplifiers, ink-writing record-
ers, timing units, and the calibrating and re-
cording oscillographs are all located outside
of the shielded room (figures 9, 19, and 20).
Figure 9 shows the power amplifiers, loud
speakers, ink recorders and cathode ray as-
sembly utilized for a continuous visual check
on the functioning of the apparatus.

The ink recorders (figure 10) are powered
with a single synchronous motor and a gear
mechanism so that they are in step with each
other, and their speed is electrically locked
with that of the recording camera. A range
of six speeds can be secured by means of the
gear-shift mechanism which was specially de-
signed and built for this unit. The sensi-
tivity of the instruments was increased better
than 200% by designing and making new
tension springs for the moving elements. By
means of a sensitive pneumatic switch the
main recording camera can be started and
stopped at any time by the experimenter who
is observing the output of the ink recorders.

* The electrical filters developed for this research followed
... of the General Radio Company.

FIGURE 8
ELECTRICAL FILTERS

The unit on the left contains the 200-cycle high-pass
filter and the R.C. low-pass filter.
The three units on the right in functional relationship
with the amplifiers constitute the anti-resonant 40-cycle
section.
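
The behavior of the R.C. low-pass filter described in the text can be sketched with the standard single-section relations. This is a simplification: the filter in the text uses shunt capacitors separated by a series resistance, which implies more than one section and a sharper cut than is shown here. The series resistance below is an assumed value, and the 15-cycle cutoff is taken from the upper end of the range quoted in the text.

```python
import math


def capacitance_for_cutoff(resistance_ohms, cutoff_cycles):
    """Shunt capacitance giving a chosen half-power frequency."""
    return 1.0 / (2.0 * math.pi * resistance_ohms * cutoff_cycles)


def lowpass_response(frequency, cutoff_cycles):
    """Fraction of the input amplitude passed by one R.C. low-pass section."""
    return 1.0 / math.sqrt(1.0 + (frequency / cutoff_cycles) ** 2)


SERIES_RESISTANCE_OHMS = 10_000.0   # assumed value
CUTOFF_CYCLES = 15.0                # upper end of the range in the text

capacitance = capacitance_for_cutoff(SERIES_RESISTANCE_OHMS, CUTOFF_CYCLES)
print(f"shunt capacitance: {capacitance * 1e6:.3f} mfd")
for frequency in (5.0, 10.0, 15.0, 40.0, 60.0):
    print(f"{frequency:5.1f} cycles: {lowpass_response(frequency, CUTOFF_CYCLES):.2f}")
```

Under these assumptions a 10-cycle wave is passed at roughly five-sixths of its amplitude, while 60-cycle interference is reduced to about one quarter; additional sections would steepen the fall above the cutoff.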

FIGURE 9
ROOM ADJOINING THE SHIELDED ROOM


Matched power amplifiers. 

Ink-writing recorders. 

Electrical signaling device. 

Interference detector. 

General Radio Type 528 Cathode-Ray Oscillograph 
Assembly used for continuous visual check on perform- 
ance of equipment and experimental phenomena. 

Sweep circuit for cathode ray oscillograph. 

Special switching unit. 

FIGURE 10

CLOSEUP OF INK-WRITING RECORDERS AND
ACCESSORY EQUIPMENT 


1. Gear shift mechanism.
2. Synchronous motor.
3. Rheostats.
4. Timing unit.
5. Switch operated pneumatically from shielded room.
6. Electrical signaling units.


These recorders are locked electrically in speed with
the camera by means of duplicate synchronous motors.

FIGURE 11
WIRING DIAGRAM OF POWER AMPLIFIERS 


The output here is to the ink-writing recorders. 

The recording camera (figures 15, 16, 17,
18, and 19) is so designed and built that the
range of speeds of the film past the lens ex-
tends from 1 inch per second to 30 feet per
second. This range is secured by the use of
interchangeable gears and motors. At all
lower speeds it is locked electrically with the
ink recorders thus making the records directly
compatible. The recording camera is
equipped with a special ground-glass focus-
ing arrangement and uses three interchange-
able lenses (Carl Zeiss Biotar 50 mm. F 1:4,
Taylor-Hobson Cooke Panchro Anastigmat
108 mm. F 2:5, and Taylor-Hobson Cooke
Kinic Anastigmat 6½ inch F 3:5). Special
screws afford extremely accurate focusing,
and a side adjustment coupled with a revers-
ing mechanism makes it possible to have as
many as four records side by side on one
strip of 35 mm. film. A reversing switch
permits the film to be run backwards without
rewinding. A series of special lens mounting
rings make possible a wide variety of image
sizes ranging from a magnification of 1/14 to
... times. The magazine chamber will hold up
to ... feet of film. The driving mechanism
is geared and the film speed past the lens is
constant no matter in which direction the film
is traveling. Thruout, ... inch metal light
seals are employed. The amount of film used
and the amount remaining in the magazine
can be accurately read from a counter which
is geared to the driving mechanism of the
camera.
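
The practical meaning of the speed range quoted above can be worked out with simple arithmetic. The sketch below uses only the two limiting film speeds given in the text; the 10 cycle per second wave is an assumed example frequency, roughly that of the alpha rhythm, and the one-minute run is likewise chosen only for illustration.

```python
INCHES_PER_FOOT = 12.0

SLOWEST_INCHES_PER_SECOND = 1.0                     # from the text
FASTEST_INCHES_PER_SECOND = 30.0 * INCHES_PER_FOOT  # 30 feet per second


def inches_per_cycle(film_speed_inches_per_second, cycles_per_second):
    """Length of film occupied by one complete wave."""
    return film_speed_inches_per_second / cycles_per_second


def feet_of_film(film_speed_inches_per_second, seconds):
    """Film consumed in a run of the given duration."""
    return film_speed_inches_per_second * seconds / INCHES_PER_FOOT


EXAMPLE_FREQUENCY = 10.0   # assumed example, cycles per second
EXAMPLE_RUN_SECONDS = 60.0  # assumed example

for speed in (SLOWEST_INCHES_PER_SECOND, FASTEST_INCHES_PER_SECOND):
    wave = inches_per_cycle(speed, EXAMPLE_FREQUENCY)
    used = feet_of_film(speed, EXAMPLE_RUN_SECONDS)
    print(f"{speed:6.1f} in/sec: {wave:.1f} inches per wave, "
          f"{used:.0f} feet of film per minute")
```

The two limits differ by a factor of 360, so the slow speeds suit long records of slow rhythms while the highest speeds spread a single wave over several feet of film.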

Much experimenting with different film
emulsions and different photographing, sensi-
tizing and developing procedures has re-
sulted in a technique which makes possible
the securing of excellent records which can be
reproduced by means of the zinc etching
process without retouching of any kind
(figures 12, 13, and 14).


The camera is used to photograph the mov-
ing spot on the cathode ray screen (figures
16 and 19). The main oscillograph is a Du
Mont Type 158 specially adapted for use
with a five-inch Du Mont tube with a blue
screen and a very rapid decay period. For
time lines a pair of Du Mont Type 164 oscil-
lographs with 3-inch blue tubes are used.
(One of these with the prism arrangement
used for optical reasons is shown in figure
19.) The time line is generated by a Gen-
eral Radio Type 377-B Low-Frequency Oscil-
lator. For photographing simultaneous am-
plifications of the electrical activity of the
cortex two specially matched cathode ray
oscillographs with five-inch blue tubes are
used (figure 20).

All of the recording equipment shown in 
figure 19 is kept in a light tight room and 
is operated from the outside. This insures a 
maximum of brilliancy on the cathode ray 
screen and eliminates interference from ex- 
traneous light sources. (The experimenter 
outside of the shielded room has a constant 
visual check by means of the General Radio 
Type 528 Oscillograph Assembly which is in
parallel with the recording oscillographs.) In
figures 16 and 19 a black slotted screen over 
the end of the cathode ray tube is shown. 
This serves to cut off all light inside the tube 
except that generated by the spot, and 
greatly increases the sharpness of the line 
traced on the photographic film. This screen 
is used on all tubes during photographic
recording.

Work with the electrical activity of the 
brain requires very delicate and sensitive
equipment and careful recording techniques.
Artifacts are numerous and sometimes ex- 
tremely difficult to detect. The apparatus 
used in the Wisconsin laboratory has been 
constructed of the finest materials available 
at the present time, and no pains have been 
spared to insure dependable performance. In 
addition to this care in the design and con- 
struction of the equipment, all of the appa- 
ratus, with the exception of the recording 
camera, has been constructed in duplicate 
with the result that the faithfulness of the 
operation of each unit may be compared at 
any time with that of a carefully matched 
unit. This enables the experimenter to check 
upon the reliability of any part of the equip- 
ment at any time. In the actual collection of 
data, leads from the same pair of electrodes 
are run to completely independent but 
matched amplifiers, ink recorders, and cath- 
ode ray oscillographs. If the records secured 
under these circumstances are not identical, 
the data are not used and the trouble is in- 
vestigated. Actually, in much of our research 
this duplicate record is secured thruout the 
experimental period. When the conditions of
the experiment render a continual check im-
possible this part of the calibration is run be-
fore the experimentation proper, in the mid-
dle of the experimentation, and upon com-
pletion of the experimentation. The bulk of
the work for the past four years has involved
the design, construction, installation, and per-
fection of equipment, the securing of develop-
mental norms for electrical activity of the
cortex under varied ... simple
experimental and evaluative procedures. Sub-
sequent papers will deal with evaluative pro-
cedures and the studies of specific problems.

FIGURE 12
This figure is a full size halftone reproduction of a com-
pletely unretouched sample record secured with the equip-
ment described in this paper. The upper and lower records
are 50 cycle time lines.

FIGURE 13
This figure is the same full size unretouched record
given above, but in this case it has been reproduced by ...

FIGURE 14
This figure shows a portion of the above record enlarged
to the size visible on the cathode ray tube before photo-
graphing.

FIGURE 15
FULL VIEW OF CAMERA

Camera proper. For details see subsequent photo-
graphs.
Special fittings for auxiliary lenses.
Extra motor unit for altering range of speeds.
Rear focusing control.
One of 4 jacks for leveling, raising, and steadying.

FIGURE 16

CLOSEUP OF CAMERA AND OSCILLOGRAPH

1. Special viewing and aligning unit.
2. Ground-glass focusing device.
3. Mechanism for driving film.
4. Du Mont Type 158 Oscillograph especially adapted
for use with a 5-inch Du Mont cathode ray tube.

The special light shield with its narrow slot is shown
in place in front of the cathode ray tube.

FIGURE 17
CLOSEUP OF CAMERA


Front focusing control. 

Horizontal adjustment mechanism. 
Reversing switch. 

Film footage counter. 
Interchangeable gears. 

FIGURE 18
CAMERA ACCESSORIES 


The front row shows five of the special lens mounting
rings for changing the image size on the film.

Prisms used in aligning cathode-ray tubes for
photographing.

Special test shot camera. A Taylor-Hobson Cooke
Panchro Anastigmat 4¼ inch F 2:5 lens is shown.

Taylor-Hobson Cooke Kinic Anastigmat 6½ inch F 3:5
lens.

Carl Zeiss Biotar 50 mm. F 1:4 lens.

Attachment employed when camera is used outside of
dark room.

Motor for special speed range.

One of a set of auxiliary gears for changing speed range
of the camera.


FIGURE 19 
CAMERA, OSCILLOGRAPHS AND OSCILLATOR

Camera previously described. 
Du Mont Type 164 oscillograph used for time line. 
Prism arrangement for photographing spot on 3-inch 
cathode ray tube. 
Main recording oscillograph. (Du Mont Type 158) 
General Radio Type 377-B Low-Frequency Oscillator 
used for generating the time line. 

FIGURE 20
MATCHED CATHODE RAY OSCILLOGRAPHS
1 and 2. Five-inch blue cathode ray tubes.
3. Power supply.
4 and 5. Front silvered mirrors for bringing the spot move-
ments together on the 35 mm. film.


THE EFFECT OF WEIGHTS ON CERTAIN INDEX NUMBERS


DOUGLAS E. SCATES
Bureau of Research, Cincinnati Public Schools


... received a substantial ... On a number ...
field of education. ... they have been employed in the
rating of educational activities in the various
states. The pioneer, and perhaps best ...
work in this field was done by Ayres,
... Phillips followed with
... modifications in 1924 and again in
193... Schrammel presented another index
... and, with Sonnenberg, later
brought the calculations up to 1934. Their
work was revised by Scates in 1937. The
Research Division of the National Education
Association set forth five elements of school
eff...

... of these index num-
bers the problem of weighting has been pres-
ent. ... left the impression that his series
were unweighted. It is true that he did not
apply special weights to them; yet the mathe-
matical functions of the data which he took
affected the relative variability of the differ-
ent series, or traits, and hence affected their
weight. Phillips, and Schrammel and Son-
nenberg, attempted to solve the problem by
using ranks which make all series (traits) of
equal weight. But equal weighting may be
no better than, perhaps not as good as, nat-
ural weighting (the relative variability in the
data as they are observed). The problem of
weighting in index numbers of this type is
inescapable; and to resort to ranking, or other
forms of equal weighting, is probably more
arbitrary than to select a set of weights that
is judged to be reasonable.

There has, however, been a general indis-
position to face the problem of weighting
directly. Weighting appears to have been re-
garded as a matter of great danger: no one of
those who have worked in this field has shown
any willingness to venture a set of reasonable
weights for the problem on which he worked.
The Research Division of the National Edu-
cation Association, realizing that natural
weighting and equal weighting were not ulti-
mate solutions for the problem, refrained from
making any combination of the five traits
they set forth rather than to assume the risk
of using weights which might be in error.



A review of the work in this field leads one to raise the question, How
important is weighting in this type of index number? Is it likely to be as
important as the selection of the original traits? Is it likely to cause
more error than the unreliability in the basic data? Is it any more
important than the several other matters on which judgment must be
exercised in the preparation of index numbers? Such questions are too broad
to be answered by any single study; they will bear much research. Studies
of the validity and reliability of the basic data must be far reaching. It
is possible, however, to make a preliminary attack on the problem by
ascertaining just how much effect different weights are likely to have, and
judging whether the degree of variation produced is unacceptable.

The present study deals with the application of various sets of weights to
the different traits or factors (sometimes spoken of as criteria) which
have been used in four published studies, three of which were ratings of
school systems and the fourth on cost of living. The purpose of this study
is to ascertain the effects of different sets of weights on the resulting
index numbers under normal working conditions. For that reason data from
published studies rather than theoretical data were used to experiment
with. The question is after all a practical one more than a theoretical
one.

The studies from which data were taken, and the technique of
experimentation, will be made clear in connection with the description of
each experiment.


I. NATIONAL EDUCATION ASSOCIATION DATA
ON STATES (ACTUAL VALUES)


The first experiment was performed on the five series of data which the
Research Division of the National Education Association presented in 1932
to represent five different aspects of efficiency of state educational
activities. These traits are as follows:

1. The proportion of children who are in school.
2. The holding power of the schools.
3. The quality of teaching provided (indexed by salaries paid teachers).
4. The school environment (indexed by the value of school property per
   pupil).
5. The per cent of literacy.

The specific definitions of each trait are given in the original source.

The questions with which the present study is concerned are: What would
have happened if weights had been assigned to these traits and the traits
had been combined into an index number? If the weights used had not been
exactly correct, how much error would have resulted? In general, how much
risk is involved in the assignment of [...]ibly forceful weights to such a
set of traits?

Weights Used for Experimentation

To seek an answer for these questions, an arbitrary set of weights was
selected. The most obvious weights to use on five series would probably be
1, 2, 3, 4, 5, or some multiples of these numbers. It was decided however
that a more forceful set should be used, in order to subject the effects of
weighting to a more searching test. A set of weights having a maximum ratio
of 11 to 1 was therefore selected. It was believed that this would
represent as great a ratio as would likely be used in the majority of
practical situations. That is, in building up an index number for rating
purposes, one will select and use traits which he regards as important;
very minor, insignificant traits will not be included. In most cases,
traits which are judged to have a value of less than 1/10 or 1/11 the value
of some other trait will probably be omitted from the index number. A set
of weights having a maximum ratio of 11:1 was therefore thought to be
satisfactory for experimental purposes. The intervening three weights were
put at 5, 6, and 7.

The result was a set of five weights (1, 5, 6, 7, 11), two of which differ
substantially, and three of which differ only slightly. When applied, this
set of weights would be interpreted to mean that three of the five traits
weighted by it are judged to be of about equal importance, though differing
somewhat among themselves; that one trait is judged to be about twice as
important as the median one, and another trait about one-sixth as important
as the median one. The results of this first experiment should hold for any
set of weights having roughly these characteristics.

Each weight of course applies only to one 
series, or trait, for any given series of index 
numbers. The effect which a weight has is 
largely dependent on the series to which it 
applies. To use the weights in any one order
for the five traits would therefore afford
only a partial test of the effects that those 
weights might have on an index number—for 
it might happen that, if the same weights were 
applied to other traits than those to which 
they were first assigned, they would show 
much greater influence on the resulting series 
of index numbers. The weights must there- 
fore be tried out in different arrangements, or 
patterns, with reference to the five traits. 

To make a complete test of the effect of a
set of weights on a given number of series, or
traits, would require that the weights be used
on the series in every possible arrangement.
That is, with the series kept in a given order,
the weights would be applied 1, 5, 6, 7, 11; [...];
and so on, through all of the possible permu-
tations. This however would result in 120
different arrangements of the five weights.
Most of these arrangements would differ so
slightly from each other that it did not seem
important to work out such a complete test.
Instead, a small sample of these arrangements
was used. The five weights were rotated, by
moving them along one trait at a time. That
is, a series of index numbers was calculated
with the weights in one position; then the
weights were moved along one trait, and an-
other series of index numbers was calculated,
and so on. The different arrangements of the
weights that were used on the five traits are 
shown in Table I. 
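
The rotation described above can be sketched in a few lines of Python. This is only an illustration, not part of the original study: the function name `rotations` and the lettering of the patterns by position are assumptions, chosen so that the output can be compared with patterns A to E of Table I. The sketch also counts the 120 possible arrangements that the text mentions.

```python
from itertools import permutations

# The five experimental weights named in the text.
WEIGHTS = (1, 5, 6, 7, 11)


def rotations(weights):
    """Return the cyclic arrangements obtained by moving every weight
    along one trait at a time (hypothetical helper for illustration)."""
    n = len(weights)
    # After k moves, the weight that started on trait i sits on trait i + k.
    return [tuple(weights[(i - k) % n] for i in range(n)) for k in range(n)]


# Label the five rotations A to E, in the order they are produced.
patterns = dict(zip("ABCDE", rotations(WEIGHTS)))

# Every possible ordering of the five distinct weights.
all_orders = set(permutations(WEIGHTS))

for name, pattern in patterns.items():
    print(name, pattern)
print(len(all_orders), "possible arrangements;", len(patterns), "used")
```

Under this labeling, trait II carries the lightest weight in pattern B and the heaviest in pattern C, which is the contrast the later discussion turns on.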

It is recognized that this rotation does not 
represent a perfect sampling of the 120 pos- 
sible arrangements; but objections were found 
to every sample considered, and the rotation 
method was adopted as yielding a fairly satis- 
factory indication of what the weights might 
cause. This set of patterns at least represents 
violent changes, for the series which receives 
the lowest weight one time receives the high- 
est weight the next time. 


Procedure 

The steps of the experiment are largely apparent from the foregoing
discussion of the use of the weights. First, each series was divided
through by its standard deviation in order to make all of the traits of
equal weight. A series of index numbers for the forty-eight states was
calculated for this equal weighting as a basis for certain comparisons.
Then the set of five weights was applied to the five series, and a second
series of index numbers was calculated for the forty-eight states. Then the
weights were shifted one trait, and a third set of calculations was made.
This was continued for each of the six special weighting patterns shown in
Table I.


TABLE I

ARRANGEMENTS OF WEIGHTS USED ON NATIONAL EDUCATION ASSOCIATION DATA ON
STATE EDUCATIONAL ACTIVITIES TO PRODUCE EXPERIMENTAL INDEX NUMBERS

Designation of          Weights Used on the Five Series
Weighting Pattern       I       II      III     IV      V
Natural                 2.4     1.8     99      22      1
Equal (O)               1       1       1       1       1
A                       1       5       6       7       11
B                       11      1       5       6       7
C                       7       11      1       5       6
D                       6       7       11      1       5
E                       5       6       7       11      1

Natural weights are those which the series have, as observed. The standard
deviation of each series is divided by the smallest standard deviation,
thus expressing the relative variability of the five series as a set of
ratios, the smallest being unity.

Equal weights (designated by O) were obtained by dividing the values in
each observed series (naturally weighted) by the standard deviation of that
series, thus reducing its variability to 1 S.D.

All weights shown in patterns A-E were applied to the series after the
series had been reduced to equal weighting.

The calculation of a series of index numbers
involved simply the summation of the values
for each state given by the five traits, after
each trait had been weighted (multiplied) as
described. The result was a series of forty-
eight sums, each representing an index num-
ber. While index numbers are usually ex-
pressed as ratios (calling for the division of
the series by some selected base), it was un-
necessary for the present purpose to make
such a division, since dividing the series by a
constant does not affect its correlation with
any criterion. The series of forty-eight sums
was therefore used directly as a series of in-
dex numbers. A technical discussion of the
formula underlying this procedure is given at
the end of this paper.

In order to facilitate certain comparisons,
these series of sums (or index numbers) were
converted into ranks. This step afforded a 
fairly significant unit for measuring displace- 
ment. The product-moment correlations 
which are recorded, however, were based 
directly on the actual values of the sums and 
not on the ranks of these sums. 
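
The procedure just described (equalize each series by its standard deviation, multiply by the weights, sum across traits, then rank and correlate) can be sketched as follows. The data are hypothetical (three traits on six cases, not the N.E.A. figures), and the helper names, the use of the population standard deviation, and the tie-breaking rule in `to_ranks` are all assumptions made for illustration only.

```python
from math import sqrt


def mean(xs):
    return sum(xs) / len(xs)


def std_dev(xs):
    """Population standard deviation (an assumption; the paper does not say)."""
    m = mean(xs)
    return sqrt(sum((x - m) ** 2 for x in xs) / len(xs))


def pearson(xs, ys):
    """Product-moment correlation computed directly on the actual values."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cov / sqrt(sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys))


def to_ranks(values):
    """Rank 1 = highest value; ties keep their original order (an assumption)."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0] * len(values)
    for position, i in enumerate(order, start=1):
        ranks[i] = position
    return ranks


def index_numbers(traits, weights):
    """Equalize each trait by its S.D., weight it, and sum across traits."""
    equalized = [[x / std_dev(series) for x in series] for series in traits]
    n_cases = len(traits[0])
    return [sum(w * s[c] for w, s in zip(weights, equalized)) for c in range(n_cases)]


# Hypothetical data: three traits observed on six cases.
traits = [
    [62.0, 70.0, 55.0, 81.0, 66.0, 74.0],
    [1200.0, 1500.0, 900.0, 2100.0, 1300.0, 1250.0],
    [3.1, 3.0, 2.2, 3.9, 3.5, 2.8],
]

# Natural weights: each S.D. divided by the smallest S.D.
sds = [std_dev(s) for s in traits]
natural_weights = [sd / min(sds) for sd in sds]

equal = index_numbers(traits, [1, 1, 1])
special = index_numbers(traits, [1, 6, 11])

r = pearson(equal, special)
# Dividing one series by a constant base leaves the correlation unchanged.
r_scaled = pearson([v / 100 for v in equal], special)

print("natural weights:", [round(w, 1) for w in natural_weights])
print("ranks, equal weighting:  ", to_ranks(equal))
print("ranks, special weighting:", to_ranks(special))
print("product-moment correlation:", round(r, 3))
```

The comparison of `r` with `r_scaled` mirrors the remark that division by a selected base is unnecessary when only correlations are wanted.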


Index Number Ranks for the States 


The ranks of the states on the variously 
weighted index numbers are shown in Table 
II. So far as is known, this is the first time 
index numbers for the states have been calcu- 
lated from these data. Certainly it is the first 
time that index numbers have appeared with 
these particular weights. As previously
stated, the National Education Association
refrained from combining the five traits be-
cause of a hesitancy in assigning specific
weights to them.

The data in Table II, although reported
here for experimental purposes, may be added
to the literature on state index numbers, along
with those reported previously by Ayres, Phil-
lips, Schrammel and Sonnenberg, etc. One
may take his choice between the seven series
of index numbers shown in this table, accord- 
ing to his judgment as to the best set of 
weights. The series appearing under the 
heading, “Equal Weights,” is most compar- 
able to the index numbers previously reported 
by others, who have consistently used some 
form of equal weighting. 

This particular series of index numbers 
(equal weighting) correlates with those reported
recently in Scates’ revision of the
Schrammel and Sonnenberg numbers to the 
extent of .928. The two sets of data are for 
approximately the same date, though the 
N. E. A. data represent a three- or four-year 
earlier status than do most of the traits in the 
other index number. 


Discussion of Rank of States 

From the data given in Table II, one may 
inspect the results of the various weightings, 
and form a preliminary conclusion as to the 
effects which these have. To some, this in- 
spection may offer a more concrete and satis- 
fying basis for conclusions than any of the 
analyses which follow. 

In the first data column, ranks are given for an index number calculated
directly from the five traits as they were originally given, without any
change in the observed (natural) weighting. While these naturally weighted
results are not a part of this experiment, they serve nevertheless as a
source of some interest. It will be noted from Table I that the series
[...] differ greatly among themselves in weight; that trait III has a
variability (weight) nearly one hundred times as great as trait V. It would
therefore be expected that, if the traits were combined in this form
(natural weighting), trait III would dominate the resulting index numbers.
Each state would be placed in the index number pretty much according to its
relative position in trait III. This [...] actually [...], because of the
extreme [...] the correlation between trait III [and] the index number for
all traits, with their natural weighting [...]

To calculate index numbers with such an extreme weighting would scarcely be
done, for there would be little use in including all of the traits. One
would stake his complete [...] trait III instead. Or, since trait IV
carries a natural weight 1/4 as great as trait III, one might include it
also. But little would be gained by adding in traits I, II, and V, which
have a combined weight of [...], which is only 1/23 the combined weight of
traits III and IV.

In spite, however, of the extreme weighting of traits III and IV when the
natural variability is left intact, it is interesting to note from Table II
that this weighting [...] index number values that are [...] consistently
close to those in the other columns. Arizona and Maryland are the principal
exceptions; New York, California, and others are examples. The correlation
between the naturally weighted and the equally weighted index numbers runs
.916. In this [...] case then, weights having a ratio [...] do not produce
an effect that differs [...] from uniform weights.

A more detailed analysis of the [...] shown in Table II is given later.


TABLE II

[...] STATES ON INDEX NUMBERS DERIVED [...] WHEN THESE TRAITS ARE [...]

The code for the weighting patterns applied to the component traits is
given in Table I. When the five component traits are ranked before
weighting and combining, the results are those shown in Table VIII.


Resulting Coefficients of Intercorrelation

In Table III are shown the correlations between the various columns of
Table II. In the left half of the table the index numbers based on equally
weighted components are shown correlated in turn with each of the five
index numbers derived from experimentally weighted series. The two columns
of correlation coefficients represent simply two different ways of
calculating the correlations. The product-moment correlations represent
calculations based directly on the actual values of the index numbers
(without grouping into class intervals). The rank correlations represent
calculations based on the index numbers expressed as series of ranks (as
shown in Table II). The two methods of calculation are mutually
corroboratory. The right half of Table III presents the results of similar
correlations among the index numbers based on specially weighted
components.


TABLE III

INTERCORRELATIONS AMONG INDEX NUMBERS DERIVED FROM DIFFERENT WEIGHTINGS OF
FIVE TRAITS. N.E.A. DATA

Correlations between index numbers based on equally weighted traits (O)
and index numbers based on specially weighted traits (A-E)

Weighting       Product-Moment      Rank
Patterns        Correlation         Correlation
O, A            .995                .992
O, D            .994                .989
O, E            .990                .993
O, B            .986                .975
O, C            .976                .952
Mean            .988                .980

Correlations between index numbers based on various pairings of specially
weighted traits (A-E)

Weighting       Product-Moment      Rank
Patterns        Correlation         Correlation
A, D            .988                .987
D, E            .986                .983
A, B            .981                .962
[...]           .978                .987
[...]           .976                .979
[...]           .973                .960
A, C            .972                .949
C, D            .963                .943
C, E            .949                .930
B, C            .942                .886
Mean            .971                .957

The letters preceding each correlation coefficient indicate the pattern of
weights (as given in Table I) used to produce the two series of index
numbers correlated.
The product-moment correlations are based on index numbers taken at their
actual values and the rank correlations are based on the ranks of index
numbers. In both cases the five series entering into the index numbers were
taken as weighted actual values, and not as ranks.

The correlation coefficients in the left half of Table III throw light on
the question, How much effect will weights have on index numbers, as
contrasted with equal weighting? In [...] how much “safety” can a worker
[...] resorting arbitrarily and mechanically to equal weighting, and how
much danger of subjective error is there in departing from [...] routine
and attempting to exercise judgment in assigning weights?

The answer to such questions is that, under the conditions of this
experiment, it makes little difference which is done. There is little [...]
arising from the element of subjectivity required for assigning special
weights in this case; and, likewise, it may be said that [...] is gained in
doing so, as compared with [...] the weighting equal. Three of the
correlations are nearly perfect, when the conditions are favorable; when
the heavy weights happen to fall upon the series which are most unique
(divergent) from the other traits, the correlation coefficient drops as low
as .976, which still is higher than the validity that probably would be
claimed for the set of traits, and is probably much higher than the
reliability of the basic data.

The right half of Table III throws light on the question, If special
weights are to be assigned, how much danger is involved that the weights
may not be properly placed? That is, that the heavy weights would be
assigned to the wrong series, the light weights to the [...] series, etc.?
The answer of correlation to these questions is about the same as in the
first case; even if weights are assigned in the worst possible way, the
correlation between these results and the results of the best possible
assignment of weights is reasonably high. Of course we cannot tell from the
present data what is the best assignment and what is the worst assignment;
but we may look at the lowest correlation in the table, and say that that
represents the greatest difference possible in the assignment of weights,
which is the difference between the best possible as[...]

As a matter of practical conclusion, one will probably concede that there
is no a priori reason to feel that one would make the worst possible
assignment of weights. Anyone familiar with the field would probably make a
pretty good assignment of the weights. If such be granted, the extent by
which his index number [...] weighting is represented [...] other
coefficients in Table III, let us say, by the average [...]
Before leaving the evidence [...] correlation coefficients, we may [...]
attention to the low correlation of .94 (with corresponding rank
correlation of .886) for the purpose of learning what [...] it so low. We
note that this value is for the correlation between the index numbers
weighted by patterns B and C. In pattern B (according to Table I), trait II
receives the lightest weight, and in pattern C trait II receives the
heaviest weight. We infer therefore that trait II is relatively unique as
compared with the other four traits, and that a change in weighting in the
ratio of 11 to 1 for that trait is sufficient, when combined with lesser
changes in the weights of other traits, to cause a definite disturbance in
the resulting index number series.


We may check this inference in several ways. We note, from Table III, that
the correlation in which the C weighting occurs is the lowest in the table
on the left side, and on the right side of the same table the four
correlations in which the C weighting occurs are the lowest. Evidently the
C weighting, with its emphasis upon trait II and its slight emphasis on
trait III, is the most disturbing (effective) of all the five patterns.

We may however secure more definite evidence. It was previously pointed out
that the effect of weights was not dependent alone upon the value of the
weights, but was conditioned also by the uniqueness of the particular
series receiving the weight. Probably the best measure of the uniqueness of
a trait is its correlation with the sum of the remaining traits. If the
weights are to be applied to equally weighted series to form index numbers
(or other composites), as in the present study, then these correlations for
determining the uniqueness of each trait should be based on sums formed
from equally weighted traits.

The correlations between each of the five traits presented by the N. E. A.
and the sum of the remaining four traits (equally weighted), are as
follows:

Trait I   and sum of other four: .886
Trait II  and sum of other four: .570
Trait III and sum of other four: .931
Trait IV  and sum of other four: .965
Trait V   and sum of other four: .834
Average intercorrelation between
all five traits: .732
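
The uniqueness measure described above, the correlation of each trait with the sum of the remaining traits, can be sketched as follows. The four series are hypothetical: each is a rearrangement of the values 1 to 6, so all have the same standard deviation and are already equally weighted. The function names are likewise assumptions made for illustration, not taken from the source.

```python
from math import sqrt


def pearson(xs, ys):
    """Product-moment correlation between two equal-length series."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cov / sqrt(sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys))


def uniqueness_correlations(equalized_traits):
    """Correlate each trait with the sum of all the other traits.

    A low value marks a relatively unique trait. The input series are
    assumed to have been reduced to equal weighting already.
    """
    result = []
    for k, trait in enumerate(equalized_traits):
        others = [s for j, s in enumerate(equalized_traits) if j != k]
        rest_sum = [sum(vals) for vals in zip(*others)]
        result.append(pearson(trait, rest_sum))
    return result


# Hypothetical, equally weighted series; the fourth runs against the rest.
traits = [
    [1, 2, 3, 4, 5, 6],
    [2, 1, 4, 3, 6, 5],
    [1, 3, 2, 5, 4, 6],
    [6, 1, 5, 2, 4, 3],
]

correlations = uniqueness_correlations(traits)
most_unique = correlations.index(min(correlations))
print([round(r, 3) for r in correlations])
print("most unique trait (0-based position):", most_unique)
```

In this toy example the fourth series plays the part that trait II plays in the N.E.A. data: it is the one whose weight would be expected to disturb the composite most.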
These correlation coefficients, taken together with the weighting patterns,
afford a final explanation of the low correlation (.94). Trait II is the
most unique of all the five traits, distinctly so. When this trait receives
the least weight (pattern B) and then receives the heaviest weight (pattern
C), the correlation between the resulting two series of index numbers drops
to .94. In fact, giving the heaviest weighting to trait II (which pattern C
does) seems to cause all of the correlations in which pattern C enters to
be low.


Variation in Leading States


Another form of evidence concerning the effect of weights is the
displacement which they cause among the top group of states, say the first
five. Any group of states would serve for illustration, but the top group
is selected on account of the large interest that is likely to center in
this group.

Table IV shows the top five states, in order, under each of the weighting
patterns, including natural weighting. Thirty out of the thirty-five places
in the table, or 86 per cent, are filled by the five states which appear in
the equal weighting column. In other words, we may say that the listing is
86 per cent consistent. Five places in the seven columns (14 per cent) are
filled by “stray” states that are placed there by the idiosyncrasy of some
particular weighting. Pattern C, already observed to be peculiar, accounts
for two of these five. Outside of the natural weighting and the C pattern
of weights, there are only two states in any of the five remaining lists
that are not in all of the five lists.

California and New York occupy the top two positions consistently in all of
the columns, except for the freakish C weighting, which places greatest
emphasis upon a trait that has a great deal that is not in common with the
remaining four traits in the index number.


Average Rank Displacement 

A third type of evidence concerning the
effect of the weights is shown in Table V.
This table is an interpretation of the correla-
tion coefficients of Table III. It shows, for
each of the pairings of differently weighted
index numbers, the differences between the
rank positions of the states in the two index
numbers. That is, for the two index numbers
consisting respectively of traits weighted by
pattern O and by pattern A, the ranks of the
forty-eight states in these two index numbers
differ on the average by 1.2 positions, the
maximum difference in the ranks of any state
being four positions. Other rows of the table
are read in the same fashion.

The figures in the “Average Difference”
column represent a directly derived form of
mean error of estimate. The values conform
closely to those calculated by the usual formula.
They may, therefore, be read in the
usual sense; that is, knowing the rank of a
state in the index number series based on
equal weighting, we may estimate its rank in
the index number series based on any of the
five special weightings, or vice versa, with an
average error of 1.8 ranks. This amount of
discrepancy is a measure of the influence of
any special pattern of weighting as contrasted
with equal weighting.

The right side of Table V is interpreted in 
the same way with respect to differences 
among the five special weighting patterns. 
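
The quantities tabulated in Table V (average, minimum and maximum difference in rank position between two index number series) can be computed as in the sketch below. The two rank lists are hypothetical, eight cases rather than forty-eight states, and the function name is an assumption for illustration.

```python
def rank_displacement(ranks_a, ranks_b):
    """Summarize how far each case moves between two rankings."""
    diffs = [abs(a - b) for a, b in zip(ranks_a, ranks_b)]
    return {
        "average": sum(diffs) / len(diffs),
        "minimum": min(diffs),
        "maximum": max(diffs),
    }


# Hypothetical ranks of eight cases under two weighting patterns.
ranks_equal = [1, 2, 3, 4, 5, 6, 7, 8]
ranks_special = [2, 1, 3, 6, 5, 4, 8, 7]

summary = rank_displacement(ranks_equal, ranks_special)
print(summary)
```

Read in the sense of the text, the "average" entry is the mean number of rank positions by which one would err in estimating a case's rank under one weighting from its rank under the other.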


TABLE IV

TOP RANKING STATES ACCORDING TO INDEX NUMBERS DERIVED FROM DIFFERENT
WEIGHTINGS OF FIVE TRAITS. N.E.A. DATA

     Natural   Equal
     Weights   Weights (O)   A        B        C        D        E
1.   N. Y.     Calif.        Calif.   N. Y.    Calif.   Calif.   Calif.
2.   Calif.    N. Y.         N. Y.    Calif.   Utah     N. Y.    N. Y.
3.   Md.       Mass.         N. J.    N. J.    Nev.     Mass.    N. J.
4.   Mass.     Ohio          Nev.     Mass.    Ohio     Ohio     Ohio
5.   Conn.     N. J.         Mass.    Mich.    [...]    N. J.    Mass.

Patterns of weights are described in Table I.














1938] INDEX NUMBERS


TABLE V

DIFFERENCES IN RANK POSITIONS OF THE FORTY-EIGHT STATES RESULTING FROM
DIFFERENT WEIGHTINGS OF FIVE TRAITS. N.E.A. DATA

Averages are for the group of forty-eight states. Differences are in units
of rank position.

Equal Weights (O) and Special Weights (A-E)

Pairing of      Corre-    Aver.    Min.    Max.
Index Nos.      lation    Diff.    Diff.   Diff.
O-A             .995      1.2      0       4
O-B             .986      2.3      0       8
O-C             .976      3.1      0       13
O-D             .994      1.4      0       6
O-E             .990      1.1      0       4
Mean            .988      1.8      0       7

Special Weights (A-E)

Pairing of      Corre-    Aver.    Min.    Max.
Index Nos.      lation    Diff.    Diff.   Diff.
A-B             .981      2.8      0       12
A-C             .972      3.0      0       15
A-D             .988      1.6      0       6
A-E             [...]     1.5      0       7
B-C             .942      4.8      0       19
B-D             [...]     2.9      0       12
B-E             [...]     2.3      0       8
C-D             .963      3.4      0       13
C-E             .949      3.8      0       17
D-E             .986      2.0      0       6
Mean            .971      2.8      0       11

Differences occur between pairs of index number [...] values for each of
the forty-eight states, when the component traits have been weighted as
indicated.
For weighting code, see Table I.
Correlations are product-moment values from Table III.


Consistency of Ranking of Individual States

The fourth type of evidence of the effect of the weights is found in Table
VI. Here the averaging is done perpendicularly to that represented in the
preceding table. Here we are, in effect, going back to Table II, and
summarizing the changes that occur on each line. We have here not the
average for the series of forty-eight states, but the average displacement
for each individual state, caused by the different index number weightings.
This table may therefore be regarded as more analytical than the preceding
table, which dealt in generalizations for the entire group of states as a
whole.

It is interesting to note that some states (California and South Carolina,
for example) show very little variation in rank positions. There are, in
fact, eight states which show an average variation of less than one rank
from one index number to another. This fact is brought about by these
particular states having characteristics which gave them consistent
positions on all of the five traits entering into the index numbers.
Weighting therefore has little effect on their positions; even extreme
weighting would produce little variation.

Some states, on the other hand (notably, Idaho and Connecticut) show a
significant shifting with a maximum of nearly twenty places, and an average
of over seven places.

And this is something that should be borne in 
mind. While the average for the table is a 
shift of 2.5 ranks, and while the average inter- 
correlation of the index numbers for the dif- 
ferent weightings is about .98, there may be 
some individual cases that diverge from the 
general tendency and exhibit marked fluctua- 
tions. The cause of these particular responses 
to various weightings is an unusual lack of 
homogeneity in the development of the state 
educational programs, as measured by the five 
traits. But such cases are always possible in 
any correlation less than unity. One cannot 
tell from the correlation coefficient alone 
whether the observed degree of correlation 
indicates a uniform tendency of all the cases, 
or whether it is an average of a heterogeneous 
group—the result of most of the cases obeying 
a close relationship, with a few cases being 
very irregular and thus lowering the coeffi- 
cient. If the former is the case, then the 
effect of different weights will be uniform; if 
the latter is the case, the effect of weighting 
will not be uniform, and particular cases may 
occur (as in Table VI) where differences in 
weights produce violent changes in the rela- 
tive standing of those cases. 
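
The case-by-case view taken in Table VI can be illustrated with a short sketch. How the original table averaged the displacements is not stated in this section, so the choice made here (the mean and maximum absolute rank difference over every pairing of the index number series) is an assumption, as are the function name and the hypothetical ranks of five cases under four weightings.

```python
from itertools import combinations


def per_case_displacement(rank_series):
    """For each case, return (average, maximum) rank difference taken over
    all pairings of the supplied rank series (an assumed definition)."""
    series = list(rank_series.values())
    n_cases = len(series[0])
    out = []
    for c in range(n_cases):
        diffs = [abs(a[c] - b[c]) for a, b in combinations(series, 2)]
        out.append((sum(diffs) / len(diffs), max(diffs)))
    return out


# Hypothetical ranks of five cases under four weighting patterns.
rank_series = {
    "O": [1, 2, 3, 4, 5],
    "A": [1, 3, 2, 4, 5],
    "B": [1, 4, 2, 3, 5],
    "C": [1, 2, 4, 5, 3],
}

displacements = per_case_displacement(rank_series)
for case, (average, maximum) in enumerate(displacements, start=1):
    print(f"case {case}: average {average:.2f}, maximum {maximum}")
```

The first case holds the same rank under every weighting and so shows no displacement, while the others shift by differing amounts even though all four series agree closely overall. This is the heterogeneity that a single correlation coefficient cannot reveal.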

It is of course true that changes such as those revealed by Table VI are
more likely to occur near the middle of the group. In social affairs, this
region is where there is likely to be least concern about exact placement.
It is worth bearing in mind, nevertheless, that the correlation coefficient
is simply an average, and the tendencies lying behind the average may be
uniform, or heterogeneous. If the latter, injustice will be done in certain
[...] by the application of weights which [...] unjustifiable.

It may be interesting in passing to note that in Table VI there are seven
states [...] maximum difference of ten or more, [...] other states having a
maximum difference of nine. Reference to Table II reveals that in the case
of every one of these states [...] difference occurs between [...] patterns
B and C. These patterns [...] weight on trait II, which [...] correlation
with the remaining traits. [...] inspection of tables might lead [...]
patterns B and C [...] Analysis also [...] the index numbers based [...]
disagree with the [...] numbers in opposite directions. This may be seen
from Table II by noting that [...] one of the cases mentioned, the ranks
[...] state for weightings B and C are on opposite sides of the ranks given
by equal weighting. Hence, the results of weightings B and C have a lower
correlation between themselves (Table III) than they do with the results
of equal weightings or with the index numbers based on any other special
weighting.


C onclusior trom 
The data presented in the vari 


lead to the following conclusions: 


Thi Ex pe riment 


Index numbers based on at least five 


which have an average intercorrelation 
about .73 are markedly stable under the i 
ence of a set of constant weights having 


maximum ratio of about ro:r. 

Series which are relatively uni 
by their correlation with the composite of t! 
remaining series, respond much more to tl! 
influence of weighting than do series whi 
correlate more highly with the remaining 
series. Extreme weights, therefore 
to an individualistic series produce greater d 
ferences than when these same weights 
applied to other series. 


11 


jUuc, 


7S shy W 


applied 


+4 


Not only does the effect of given weights 


vary with different series, but the effect 
within a series, according to whether a c 


state, in this study) exhibits average 


sistency in its rating on the various 


varies 
ase i 


originally measured. Cases which exhibit un- 


_s 
heterogeneity in the chara
vy the different traits will show greater 
the influence 


re- 


teristics 
erage displacement under 


s weightings. 


II. NATIONAL EDUCATION ASSOCIATION DATA ON STATES (RANK VALUES)
Association 
ranks, as 


The two 


National Education pre- 
1 traits in terms of 
} 


terms of actual values 


their five 





data whicl are of course possible 
er actual values are given—raised cer+ 
estions Is it better to use tual \ 

the component traits for calculating in- 


ranks do is well? 
How 


with the differ- 


will just 


h difference is likely to result ? 


difference comnarée 
index numbers caused by changing the 


corre 





f the series? Does the rank 


technique give satisfactory values? Is 


. t 1 mre} ] 
eep-seatead preludice 


‘f some statisticians 
ranks supported by calculations of the 
t typer 
btain light on such questions the study 
preceding section was entirely 
e, using rank values instead of actual 
s, for the component traits. It must be 
n mind that this work was distinctly dif 
nt from that reported in the first section. 
ks which were shown or used in Tables 
VI were simply the ranks of the final re- 
ranks of index number values. In the 
1 (component 


d in the 


section, the five origina! 
were converted into ranks before being 
i Phe ‘ (index num 


are again ranked 


series Of SUMS 
the same as was done 
ertain purposes in the first section—but 
essential difference between the treatment 
lies in the the 


the two sections form of 


| data. 


Weighting Patterns of the Second Study
The weighting patterns used for this second
study are essentially the same as those for the
first study, shown in Table I, except that the
weights 5, 6, 7, 11 were each one less. This
was not a purposeful change; it came about
through working on the project at different
times, with slightly different notions as to
what was desirable.

The change in weights makes no appreciable
difference in the index numbers. Its effect
may be measured by the correlation between
index numbers based on the two sets of
weights: 1, 5, 6, 7, 11 and 1, 4, 5, 6, 10.

order to test the maximum effect of the 


{ 
iS not 


( dal weighting | tte { was ( whicn 
pDiaces the maximu welgnt in DOLN cases up 
on the trait which is least in agreement with 
the remaining series Che two series of index 

} } 
numbe resulting these tw we nti 
patterns, show I ( elati 7 
there being a differs é the 
placement ot five states t of the f eignt 
With any of the other four spe weight- 
ing patter! we s} uld expect even greate! 
vreement 

] t tte thic ‘ 

TABLE VII

WEIGHTING PATTERNS USED ON THE RANK
FORM OF N.E.A. DATA TO PRODUCE
EXPERIMENTAL INDEX NUMBERS

Weighting Weights of the I Tri 
Pa I I] III IV \ 
l l l l ] 
A ] 2 : 10 
B j 5 6 
( { l } 5 
) 5 10 4 
> { F 6 ) | 
Index Numbers from Ranked Components 


The six series of state index numbers resulting
from the six different weighting patterns
are presented in Table VIII. These values,
like those of Table II, were produced for experimental
purposes, but may be regarded as
legitimate index numbers of the states. Those
values for weighting pattern 0 (in which the
ranked series are combined without special
weighting) are comparable to other index
numbers similarly constructed and previously
published. Some of the other series of this
table may be superior to the equally weighted
ones and may be employed. The choice is a
matter of judgment.

The correlation of the ranks in column 0 of
Table VIII with the revised Schrammel and
Sonnenberg index number* (the form similarly
based on ranked components) is .865.

The purpose of preparing the data of
Table VIII was, however, not primarily to
present another set of state index numbers,
but to afford a comparison with the data of
Table II, so as to show the effect on index
numbers of ranking component series instead
of taking them at actual values. The comparison
between these two sets of resulting
index numbers is analyzed in the following
paragraphs.


* See notes
TABLE VIII


RANK OF STATES ON INDEX NUMBERS DERIVED FROM VARIOUS WEIGHTINGS OF THE RANKS OF
FIVE TRAITS SUGGESTED BY THE NATIONAL EDUCATION ASSOCIATION


Rank of States on Index Numbers From Each of Six
Weighting Patterns


Equal (0) A B C D E


Alabama 15.5 45 45 44 46 14 
Arizona a 29 27 31 32 24 sO 
Arkansa ; 10 40 44 39 41 43 
Calif i l l l 1 l ] 
( ra 22 22 27 24 23 2 
Conne t 13 11 5 25 12 
Delawa - Sestheniial 23 28 21 28 21 17 
Florida ai 37 37 37 35 39 35 
Georgia 47.5 47 46 47 47 47 
Ida} 24 18 29 17 22 27 
I 10 13 9 13 10 5 
na 17 23 20 14 14 15 
[owa 18 21 18 12 20 2 
Kansa 26 26 28 21.5 26 2 
Kentucky) , 41.5 41 41 43 40 45 
Louisia 13 14 42 46 42 
Mair 27 30 22 21.5 28 28 
Maryland - 34 33 32 38 31 
Ma ! tt 3 6 3 8 3 
Mict i a 5 8 4 3.5 6 i 
Minnesota ‘ Pe 14 14 18 19 5 
Miss ippi kate 15.5 46 47 41 45 4 
Missour : 31 31 26 30 30 29 
Montana 11 } 11 9 13 14 
Nebraska 0) 17 23 15 25 2 
Nevada . 4 2 8 2 5 
New H: shire 19 24 16 ») 18 2 
New Jerse 13 15 7 23 11 7 
New Mexic 36 38 36 34 35 7 
New Y 2 3 2 5 2 
North Carolina 14 43 13 15 44 41 
North Dakota oe 0 25 30 26 32 $1 
Ohi = Se 7 6 12 6 4 3 
Okiahoma SS ne 33 34 35 31 34 34 
Oregon a eee 3 4 15 10 8 10 
Pennsylvania — 21 20 19 27 17 19 
Rhode Island 28 29 25 33 29 24 
South Carolina : 47.5 48 48 48 48 48 
South Dakota ; ee 25 19 24 19 27 25 
Tennessee 39 39 39 40 38 39 
Texas -- poseuwens 38 36 38 36 37 32 
Utah iP ; 9 10 10 3.5 9 18 
Vermont -............- tens Te 32 33 29 33 33 
Virginia z a ee 41.5 42 40 42 43 42 
Washington Ee ee eee 7 7 13 7 7 1! 
West Virginia —- ee : 35 35 34 37 36 36 
ji ee ; : Oita 15 16 12 16 16 12 
13 5 17 11 15 13 


Wyoming ___- Ce ee neaeniaaielis 


The code for the weighting patterns applied to the ranked component traits is given in 
Table VII. When the five component traits are taken at actual, rather than rank, values, the 
results are those shown in Table II. 
Effect of Ranking the Component Traits
Ranking makes series have the same variability
(with negligible differences arising
from an occasional tie), when the populations
are constant, as in the present case. The
series are therefore equally weighted when
they have been converted into ranks. Ranking,
however, imposes an additional element
in that all of the values in a series are equally
distant (one rank apart) instead of differing
by various amounts. This fact operates to
lify the effect of weights on the series. 
Whereas a series of actual values might have
an extreme value (even though the series is
no more variable than other series are, with
respect to the average variability for a series)
which would dominate the placing of that
particular case in the index number, this could
not occur in the case of ranks.

On the other hand, differences between certain
pairs of values may be very small in the
case of series of original values; when these
series are ranked, no case will differ from another
by a smaller amount than any other case
will. In other words, whereas equalizing the
standard deviations of series will equalize 
their variability (and weight) on the average, 
ranking goes further and adds to the equal 
variability the element of equal differences. 
Equal weighting, then, for ranked series, is 
not a matter of an average weighting for the 
series, but is a matter of uniformly equal 
weighting throughout the series. Whether or 
not this additional element is desired depends 
on what is wanted. In the present study we 
are concerned only with observing its effect. 
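
The distinction between equal variability and equal differences can be seen in a small sketch. The series below is hypothetical, and the helper name `average_ranks` is introduced here only for illustration; tied values are given the mean of the ranks they occupy, which is one common convention and is assumed rather than taken from the study.

```python
def average_ranks(values):
    """Rank from 1 (smallest) upward; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        for k in range(start, end + 1):
            ranks[order[k]] = (start + end) / 2 + 1
        start = end + 1
    return ranks


def gaps(sorted_values):
    """Differences between neighbouring values of an already sorted list."""
    return [b - a for a, b in zip(sorted_values, sorted_values[1:])]


# Hypothetical series with one extreme case.
actual = [10.0, 10.5, 11.0, 12.0, 40.0]
ranks = average_ranks(actual)

print(gaps(sorted(actual)))  # [0.5, 0.5, 1.0, 28.0]
print(gaps(sorted(ranks)))   # [1.0, 1.0, 1.0, 1.0]
```

In the actual values the extreme case stands 28 units from its neighbor; after ranking it stands one rank away, like every other case.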


The difference produced in index numbers 
by ranked and unranked component series in 
the present case is represented by a (rank) 
correlation of .988, when the component series
are equally weighted. That is, index num- 
bers based on equally weighted component 
series, using actual (unranked) values of the 
component series in the one case, and ranked 
values in the other, correlate to the extent of 
.988. The amount by which this coefficient
is less than unity represents the difference in- 
troduced by the ranking of the five compo- 
nent traits before they are summed to form 
index numbers. 
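
The comparison just described may be sketched as follows. The data are hypothetical, all function names are illustrative, and it is assumed that the actual values are brought to equal standard deviations before summing and that the rank correlation is computed as the product-moment correlation of ranks; the original computations may have differed in detail.

```python
from statistics import mean, pstdev


def average_ranks(values):
    """Rank from 1 (smallest) upward; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        for k in range(start, end + 1):
            ranks[order[k]] = (start + end) / 2 + 1
        start = end + 1
    return ranks


def standardize(values):
    """Equalize variability: zero mean, unit standard deviation."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


def index_numbers(series_list, weights):
    """Weighted sum of the component series, case by case."""
    n_cases = len(series_list[0])
    return [sum(w * s[i] for w, s in zip(weights, series_list)) for i in range(n_cases)]


def pearson(x, y):
    """Product-moment correlation of two equally long lists."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den


def rank_correlation(x, y):
    """Rank correlation taken as the product-moment correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical data: three component traits measured on six cases.
traits = [
    [52.0, 61.0, 47.0, 70.0, 58.0, 49.0],
    [0.99, 0.99, 0.86, 0.99, 0.98, 0.91],  # skewed, bunched near the top
    [3.1, 4.0, 2.2, 4.4, 3.0, 2.9],
]
weights = [1, 1, 1]  # equal weighting (pattern 0)

from_actual = index_numbers([standardize(t) for t in traits], weights)
from_ranks = index_numbers([average_ranks(t) for t in traits], weights)
rho = rank_correlation(from_actual, from_ranks)
print(round(rho, 3))
```

Changing `weights` to one of the special patterns gives the corresponding coefficient for that pattern.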


The effect of ranking varies under the influ- 
ence of different weighting patterns. Corre- 
lation coefficients, similar to the one just de- 
scribed, for the various weighting patterns are 
shown in Table IX. It will be noted that the 


correlation is the highest for weighting E, and
the lowest for weighting A. The explanation 
is largely to be found in trait V, which re- 
ceives the least weight (1) under pattern E 
and the greatest weight (11) under pattern A. 
It happens that trait V is extremely skewed, 
more than half of the cases lying in the top 
interval (Table X). This trait represents one 
of those cases in which ranked data must 
necessarily depart significantly from the dis- 
tribution of actual values —for the ranked 
data are distributed uniformly along the scale, 
and cannot bunch at one end. 


TABLE IX
CORRELATION BETWEEN INDEX NUMBERS BASED
ON ACTUAL VALUES (TABLE II) AND ON
RANKS (TABLE VIII) OF COMPONENT SERIES,
WHEN VARIOUS WEIGHTING PATTERNS ARE
USED

Weighting     Coefficient (rho)
0             .988
A             .969
B             .982
C             .988
D             .990
E             .994
TABLE X 
DISTRIBUTION OF ACTUAL VALUES FOR 
TRAIT V 
.990-—.999 _ ee ae See ee er aw ee 
-980—.989 ¥ alata al Ee als . 5 
OF, Ee Se ee 
0 SS ee A ean — = 
CS eee ee eee a 
SEE -ccnccaamedeueen a es —— 
EE Se ee a ata a ae 
.920-.929 ... Re elegs 
IIE cic; sienna gis dade ieieabeaiaanite mens and 1 
[CS aa a eee oe ae a~ 
BO ee ee en ee ney nee - 
SEED cnnnomas PR. SER te ee ae 
SO i ia 1 
.860—.869 ee eee 2 
RII ics cictisandentnitis tankniahshiiedaeee eeehaaenibipiaaen 1 
48 


While such a departure from the bunching 
of actual values occurs in most distributions 
when they are ranked, it usually occurs 
toward the middle of the range, and not at the 
end. Forced departures of the shape of rank 
distributions from the shape of the actual data 
are more disturbing when they occur at the 
ends of the distribution, because they lower 
the correlation more. In the present case the 


(product-moment) correlation between the 








/ )/ / \ 1 / Cf 


t ii and the! ink 
ignificant drop trom pertect corre 


values of trait \ by itself 


ition. | wav of contrast, the correlation 
etween the actual and ranked values of trait 
I\ 8< which is very satisfactory. The 
tributior f actual values of trait IV is 
htly rectangular—the kind of distribution 
nanges little in characteristics when it is 
ted into ranl lrait III has a corre- 
61 between its actual and ranked 
Correlations for traits I and II were 
ited: but the frequency distribu- 
ve ar to those of traits III and IV. 

, tions Between Index Number 
The foregoing section dealt with the relation
between the index numbers computed
from two different forms of component series.
We may now consider the interrelationship
between the index numbers derived from the
ranked components, to ascertain the extent to
which the ranking has disturbed their internal
relationships.

The correlations between the index numbers
which are based on the variously
weighted ranked traits are presented in Table
XI. It will be seen that these values are of
the same general order as the rank correlation
coefficients presented in Table III. The mean
value for the left side of the table is only .002
lower than the corresponding figure in Table
III, and the mean for the right side is .007
lower. Although there are a number of minor
shifts in the relative positions of the coefficients
from Table III to Table XI, it cannot
be said that the ranked series respond very

EXPERIMENTAL EDUCATION Vol. 6, N 


differently to the weights which were used 
from what the actual values did. The story
told in the two tables is almost identical.

Displacement in Rank Positions 

If we take the five top ranking states in
each of the six series of index numbers as
shown in Table XII, we find a general, but
not complete agreement with the lists in Table
IV. Out of the thirty positions in the six columns,
eight have dropped below fifth rank
and have been replaced by other states in
Table XII. The agreement in content of the
two tables is 73 per cent. The average rank
(according to Table II) of those states which
have dropped out of the first five in Table
XII is 4; the average rank (in Table II) of
the eight states which replaced them is 7.7,
a difference of nearly four ranks on the average.
On the other hand, changes in rank
position among the first three states of each
column are slight.
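
The 73 per cent figure follows from the counts given above (22 of 30 positions retained). A minimal sketch of the tally is shown here; the function name `content_agreement` and the two short columns are hypothetical and serve only to illustrate the counting.

```python
def content_agreement(columns_a, columns_b):
    """Share of positions in columns_a whose occupant also appears
    in the matching column of columns_b."""
    kept = 0
    total = 0
    for col_a, col_b in zip(columns_a, columns_b):
        members_b = set(col_b)
        kept += sum(1 for state in col_a if state in members_b)
        total += len(col_a)
    return kept / total


# Counts stated in the text: thirty positions, eight replaced.
positions, replaced = 30, 8
print(round(100 * (positions - replaced) / positions))  # 73

# Hypothetical pair of top-five columns differing in one state.
first = [["Calif.", "N. Y.", "Nev.", "Mass.", "Ohio"]]
second = [["Calif.", "Nev.", "N. Y.", "Mich.", "Mass."]]
print(content_agreement(first, second))  # 0.8
```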

In lieu of presenting complete tabulations
for this study corresponding to Tables V and
VI, we may compare the general averages for
such tables. Since the correlation values in
Table XI are only slightly lower than those in
Table III, it would be expected that the average
displacement caused by different weightings
would be only slightly larger. In comparison
with average differences of 1.8 and
2.8 for Table V, the corresponding values for
the present study are 2.1 and 3.2. The average
minimum difference between the ranks of
the states from one index number to another
is .02 in Table VI, and .1 in the present


TABLE XI 


INTERCORR 


WEIGHTINGS OF FIVE RANKED TRAITs. 


between index numbers based on 
weighted ranks of traits (0) and index 
nbers based on specially weighted ranks of 


aits (A-E) 
Rank 


erns Correlation 
D 994 
OF h 980 
OA 979 
0.B al 977 


958 





ELATIONS AMONG INDEX NUMBERS DERIVED FROM DIFFERENT 


N.E.A. DATA 


Correlations between index numbers based on 


various pairings of specially weighted ranks 
of traits (A-E) 


Weighting Rank 

Patterns Correlati 
eee ee .979 
RR a a eee .978 
Oe ee Se ee ae .972 
4 ee a ee aes 963 
ECR oee aaa 953 
SS eee aS 950 
CS) Se eee .942 
OE nc ae a en, ee a a 936 
ltt “ateitaalavenaitaiaatehaieacacanitediciis 913 
ES AE Ri er .909 
Bree ee Cee .950 





INDEX 





TABLE XII 

RANKING STATES ACCORDING INDEX 

{BERS DERIVED FROM DIFFERENT WEIGHT- 
FIVE RANKED TRAITS. N.E.A. DATA 


- 
iv) 


A B Cc D E 
Calif. Calif. Calif. Calif. Calif. 
Nev N. ¥ Nev — os Mee ee 
N.Y. Mass. Utah Mass. Ohio 
*Ore. Mich. *Mich. Ohie *Mich. 
Wyo. *Conn. N. Y. Nev Mass. 
tates were not in this column in 

1\ They represent changes in the 

‘omprise the top five according to 
ndex number, due to taking ranks 
ponent series rather than actual 
average maximum difference for 

\I is 6 ind is 6.6 for the present 


rious results in this second experi- 
ist be thought of as applying to 
ked components which have an average in- 


tion of .670 he average intercor 
the original (unranked) values 

Conclusions from Second Study
The amount of change in relative position
caused by ranking a series depends upon
the shape of the frequency distribution of the
original values. It is possible that no change
in relative position at all will occur. Rectangular
(flat) distributions cause least
change in relative position when converted into
ranks; violently skewed J-shaped distributions
lower correlations most when converted
into ranks. Observed product-moment correlations
of ranks with actual values of the
series ranged from .985 down to .838.


cs of 


P Sé 


Expressing the original values of component
traits in terms of ranks should not ordinarily
cause any material change in the resulting index
numbers. Even those distributions which
are markedly altered when ranked will not be
significantly disturbing to an index number
unless weighted heavily. Correlations between
index numbers based on ranked values
and on actual values of component series
ranged from .994 down to .969, according to
the weighting pattern applied to the ranked
series. When the heavy weight falls on a
series that is highly skewed, the resulting index
number is affected more. For equal
weights, the correlation coefficient was .988
(Table IX).

The average intercorrelation between index
numbers based on different weightings of
ranked components (Table XI) is almost
identical with the average for index numbers
based on actual values of component series
(Table III). The internal relationships between
index numbers do not appear to be
markedly disturbed by the ranking.

The ranking of component traits does cause
some changes in the index numbers even when
equally weighted. These changes would not
seem to be as large as the possible discrepancies
arising from other, unknown causes
(such as lack of validity).

Whether the changes arising from ranking
are in a desirable direction, and represent a
gain rather than a loss, is a matter for judgment.
Without an accepted external criterion,
statistics can only analyze and measure
the amount of the change: statistics cannot
pass upon its desirability.

III. SCHRAMMEL AND SONNENBERG’S DATA
It was desired to carry on the study of the
effect of constant weights on index numbers
with a larger number of series. The data published
by Schrammel and Sonnenberg, giving
an index number of states based on eleven
traits, were employed in a third study which
follows in general outline the first two.

The pattern of weights had to be adjusted
somewhat to cover the eleven traits, but was
kept as nearly like the patterns previously
used as possible. The weighting patterns for
this third study are shown in Table XIII.
In the present experiment, the weights were
rotated the same as in the first two studies,
but were moved two series at a time (three
series in one case) so as to produce five different
special weightings, as in the first two
experiments.

This and the remaining studies will be presented
briefly, since the general outlines of
the attack and the incidental analyses are now
obvious. The rank of the states according to
the resulting index numbers will not be presented
but analyses of the table will be given.
The equally weighted index number has of
course already been published.

The Results of Varying the Weights 

Intercorrelations among the index numbers 


resulting from the changes in weighting pat- 
terns are shown in Table XIV. It will be 





OF EXPERIMENTAL EDL ¢ 


17/O.\ [Vol. ¢ 


TABLE XIII 


THE ELEVEN 


IV ’ VI 


III 


TRAITS OF SCHRAMMEL AND SONNENBER 


VII VIII Ix X 
1 , 


i 
6 
6 


TABLE XIV 


\ MONG 
RANKED TRAITS. 


RCORRELATIONS 


Correlations between index numbers based on 
weighted ranks of traits (0) and index 
numbers based on specially weighted ranks of 
traits (A-E) 
Weighting 
Patter Correlation 
OE 958 
0,( 949 
0.B .940 
0,A 935 
0,D 904 


equal 


Rank 


Me il .937 


INDEX NUMBERS DERIVED FROM DIFFERENT WEIGHTINGS OF ELEVE> 
SCHRAMMEL AND SONNENBERG DATA 


Correlations between index numbers based 
various pairings of specially weighted ranks 
of traits (A-E) 


Rank 
Sorrelatior 
916 


Weighting 
Patterns ( 
D,E - 
A,B 
»C 
A,C 
,E 
A,E 


E 
I 
I 
I 


A, 


Mean 


TABLE XV 


DIFFERENCES IN RANK POSITIONS OF THE 
WEIGHTINGS OF ELEVEN TRAITS. 


FORTY-EIGHT STATES RESULTING FROM DIFFERENT 
SCHRAMMEL AND SONNENBERG DATA 


Averages are for the group of forty-eight states 


Equal Weights (0) and Special 
Weights (A-E) 

Pairing of Corre- Aver. Min. Max. 
Index Nos. lation Diff. Diff. Diff. 
0—A . .935 3.6 0 12 
0-B ‘ .940 3.4: 0 13 
0—C 949 f 0 15 
0—D 904 4.38 0 16 
0-r oo wee 3.20 0 10 


Mean 937 3.56 0 13 


noticed that these are much lower than those
observed in the first and second studies. The
average correlation on the left side of Table
XIV is .05 lower than in Table III, and the
average for the right side is .11 lower. In


Special Weights (A-E) 


Aver. Min. 
Diff. Diff. 
4.08 0 
4.42 0 
7.06 0 
5.48 0 
4.17 0 
6.60 1 
5.38 0 
6.02 0 
5.77 

4.60 


Corre- 
lation 
. 213 
.893 
.776 
.863 
910 
-784 
873 
.842 
\-E .852 
)-E .916 


Pairing of 

Index Nos. 
A-B _-. 
A-C 


Mean -_- .862 5.36 19 


fact, only one value in Table XIV exceeds
the lowest value reported in Table III.

The amount of variation between the place- 
ment of the states by the differently weighted 
index numbers is shown in Table XV. The 








TABLE XVI 





IN RANK POSITIONS OF EACH 


FERENCES : 
“TATE RESULTING FROM DIFFERENT WEIGHT- 


vcs OF ELEVEN TRAITS. SCHRAMMEL AND 


NNENBERG DATA 


Average Minimum Maximum 
Difference Difference Difference 


a 3.1 0 ‘ 
a 5.5 5 13 
insas 2.9 0 6 
rnia _ 5.4 l 11 
ee ae 9.1 0 18 
ecticut ‘ 6.8 1 16 
aware 10.0 ] <0 
1 1.5 0 3 
‘oorgia 3.3 1 7.5 
a 5.5 0 12 
6.9 0 16.5 
" 5.7 5 15 
' 6.4 1 14.5 
is 8.2 2 19 
cky -- 1.9 0 5.5 
siana 3.2 0 7 
¢ 2.9 0 14 
faryland _.-. 5.7 0 14 
lassachusetts 3.5 0 8 
higan -- 4.6 1 10.5 
nesota ai 1.6 0 4 
SIsSIppl — 3 3 0 ras 
Missourl -- 2.5 0 6 
tana 1.6 0 4 
N aska 33.3 1 28 
Nevada . 2.4 5 5 
New Hampshire 6.6 1 15 
New Jersey -.. 10.0 0 20 
New Mexico —- 6.6 0 13 
New York —_- 8.9 0 20 
North Carolina 1.8 0 4 
North Dakota- 8.2 0 18 
J a ee 7.5 2 16.5 
Oklahoma » 3.9 0 & 
Oregon ..--.. 7.6 1 18 
Pennsylvania_- 5.3 1 14 
Rhode Island_- 3.3 1 7 
South Carolina 1.0 0 3 
South Dakota_ 8.2 1 16 
Tennessee ____ 1.8 0 4 
WN ie wicca 3.1 0 6 
i 1.2 0 8 
Vermont __--- 5.5 1 13 
Virginia  ___- 2.0 0 5 
Washington _-_ 2.0 0 4 
West Virginia 1.5 0 3 
Wisconsin ___- 5.1 5 11 
Wyoming 9 0 2 
Mean 4.78 4 10.8 


“Average Difference” is the mean of the 
(absolute) differences for all possible pairings 
of the six series of index numbers based on the 
six weighting patterns shown in Table XIII. 
The Average Difference for each state is based 
on 15 differences. 
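
The fifteen differences are the pairings of six series taken two at a time. A short sketch of the computation for a single state follows; the six ranks are hypothetical and the function name `rank_differences` is illustrative.

```python
from itertools import combinations


def rank_differences(ranks_by_weighting):
    """Absolute rank differences over all possible pairings of the series."""
    diffs = [abs(a - b) for a, b in combinations(ranks_by_weighting, 2)]
    return {
        "count": len(diffs),
        "average": sum(diffs) / len(diffs),
        "minimum": min(diffs),
        "maximum": max(diffs),
    }


# Hypothetical ranks of one state under weightings 0, A, B, C, D, E.
summary = rank_differences([10, 12, 9, 15, 10, 11])
print(summary["count"])    # 15
print(summary["minimum"])  # 0
print(summary["maximum"])  # 6
print(round(summary["average"], 2))
```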
mean maximum difference for the left side is
13, as compared with 7 in Table V; and the
difference for the right side is 19, as compared
with 11 in the earlier table.

Table XVI gives the amount of variation in 
rank from one index number to another, for 
each individual state. It will be seen that 
the values for the table are larger than those 
for Table VI. The two sets of figures are 
as follows: 


                            Table VI   Table XVI

Mean Average Difference       2.5        4.8
Mean Minimum Difference       0.02       0.38
Mean Maximum Difference       6          11


The average difference and the maximum dif- 
ference are roughly twice as great as for the 
earlier data. 


Analysis of the Results 

Inspection of Table XIV reveals that the 
index number having weighting D correlates 
lowest with any of the other index numbers; 
it is at the bottom on the left side and occu- 
pies the three bottom positions on the right 
side. Interestingly enough, it also occupies 
the highest position on the right side. The 
only inference is that weighting E deviates 
from the rest of the index numbers in the 
direction of D sufficiently to give the highest 
correlation in the table. 

The crux of the results found for the 
Schrammel and Sonnenberg data lies in the 
extremely low intercorrelations between the 
various traits which were employed. To use 
a single state as a suggestion of what happens, 
New York has the following ranks in the 
eleven traits: 1, 2, 2, 4, 9, 15, 18, 27, 38, 
44,45. A number of other states share simi- 
lar vicissitudes. The picture for the entire 
set of forty-eight states is given by the fol- 
lowing correlations: 


and the sum of the 
remaining traits 


Correlations between the 
sum of traits 


Be ere 111 
LL 427 
VE. PEE Sicncnnms .378 
i ea 549 
i ae hie .208 


Average intercorrelation be- 


tween the eleven traits_- 132 


When relationships between component
traits are this low, the effect of weights will
be marked, and precise analysis can not be
made. That is, the correlations for the different
weighting patterns will not follow
closely the magnitude of the intercorrelations
of the original series because, when correlation
there are too many different 
tures, so that roughly, almost 
anything may happen. It may be wondered 
that correlation coefficients averaging around 

lable XIV) would be found for the dif- 
ngs of such unrelated data. 
tht be proper to raise a question, in 
the desirability of including 
» a very low correlation with 
_ in an index number for rat- 
It is true that the philosophy 
in psychology, is to find 


of the 
value ire iow, 


DOSSIDI€ 


in; lysis 
» do not have anything in common 
itther: but it is also recognized that 
1 when found, be fundamental 
They will 
aspects or parts that can 

rved; they will not be such 

in get quantitative data on for 

in index number of this sort. It 
within reason to expect traits rep- 


surface effects.” 


fferent observable phases of “good- 
ymplex phenomenon, such as state 
activity, to correlate pretty well 
1ases—taking all of the states as 
a minimum correlation of 
would be a criterion. 
ncludes traits that bear little rela- 
it would seem 
eason to give such traits a 
all traits individu- 


Pe rh Ips 
suitable 


he rest of his traits, 
yy ? 


. f 
g. If 


ot the 


low relationship with one another, 


the rest of the traits as a group, then 
uld question whether he has selected 
raits that bear upon a homogeneous concept. 


IV. CHAMBERLAIN’S DATA

A fourth study was made on a still larger
number of traits, and a larger population.
Chamberlain published data on the educational
activities of 120 counties in Kentucky.
He used 15 traits, weighted them equally, and
combined them, giving the median of the
sigma values for each county. He also gave
rank values for each trait, which were used
in the present study. Because data were incomplete
for twelve of the counties, these
cases were dropped, leaving a population
of 108.

The weighting patterns are shown in Table
XVII. They were moved three columns
(traits) at a time, resulting in five special
weightings in addition to the equal weightings,
as in the previous studies. When this
was done, the series of resulting index numbers
yielded correlation values as shown in
Table XVIII. These values are practically
as high as those given in Table III, being on
the average only .012 lower for the left half
of the table, and .009 lower for the right side.

An analysis of differences was made, similar
to those given earlier in Tables V and XV.
In this case, the frequency distribution was
made for all of the five or ten pairings of
index numbers together, rather than giving
the average difference separately for each
pairing. It will be seen that there are a few
differences that are very large; on the other
hand, 60 per cent of those in the left column
and 42 per cent of those in the right column
are differences smaller than six. In considering
differences in this table, it must be borne
in mind that there are 108 cases instead of
48. Each rank position is therefore a smaller
portion of the range. Reducing the average
difference to comparable terms with those
and 4.1 ranks, respectively (based on a p 
ulation of 48). These values are appr 
mately 50 per cent higher than those 
Table V. 
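
The reduction to a population of 48 is not spelled out. One plausible reading, assumed here and not confirmed by the text, is a simple proportional scaling of each rank difference by the ratio of the two population sizes. The function name and the example difference of 9 ranks are hypothetical.

```python
def rescale_rank_difference(diff, n_source, n_target):
    """Express a rank difference observed among n_source cases as the
    equivalent share of the range among n_target cases.
    Assumption: simple proportional scaling by population size."""
    return diff * n_target / n_source


# A hypothetical difference of 9 ranks among 108 counties
# corresponds to 4 ranks among 48 states.
print(rescale_rank_difference(9, 108, 48))  # 4.0
```

A scaling by the ratio of the ranges, (48 - 1)/(108 - 1), would give nearly the same result.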

The average intercorrelation of the original
15 series of ranked data is .524.


V. TEACHERS’ COST-OF-LIVING INDEX


It was desired to include in this general
analysis an index number of the economic
type, in addition to the four index numbers
for rating purposes. For this fifth study,
data were taken from a National Education
Association research bulletin on the cost of
living for teachers. The principal question
is, How much will this index number vary
under the influence of weighting patterns such
as those used in the preceding experiments?
The question is in part answered by data
contained in the Research Bulletin, which reviews
six other index numbers differing chiefly
in the weights which are used. Five of these
are graphed, showing the variations which result
from differences in weighting. In the
year of largest disparity (1932-33) the maximum
difference was 10.6 per cent, the minimum
difference was 0.5 per cent, and the
mean difference was 5.2 per cent. The
weighting patterns used in these different index
numbers are somewhat difficult to present,
because the categories varied. Perhaps
the present experiment will provide a sufi 
cient number of comparisons, without an at 
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Rank 


Correlation 


ES IN RANK POSITIONS OF THE 
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between index 
ngs 0,A; 0,B; 0,C; 
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Q 1 
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SE oe 30 
TABLE XVII
PATTERNS APPLIED TO THE FIFTEEN 
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VIII XII 
TABLE XVIII


LATIONS AMONG INDEX NUMBERS DERIVED FROM DIFFERENT WEIGHTING 
CHAMBERLAIN’S DATA 


Correlations between the in 


OTUDY 


XIII 


lex numbers 


XIV 


XV 


6 6 
5 5 
i 1 
] l 


$3 OF FIFTEEN 


on various pairings of specially weighted ranks 


of traits (A-E) 


Weighting 


Pattern 


C-—E 
A-C 
B-E 

A_pD 
B-—C 
A-E 

—D 


( 

D-E 
A-B , 
Be .... 


Mean 7 oe : 


XIX 


1 
ts 


COUNTIES RESULTING FROM 


CHAMBERLAIN’S DATA 
index 


A,D; 


Differences between 
weightings A,B; A,C 
8,E; C.D; C.E; D,E 


Class Interval 


41.5 
36—38.5 
Ded 2) ) 
30 32 5 
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24—26.5 va —_ a wie 
21-—23.5 —- _ ‘i 
tempt at analyzing the weightings of the other
index numbers published, which would neces- 
sarily be unsatisfactory because of too many 
basic differences. 

The weighting patterns used in the present
experiment are shown in Table XX. The
weights used by the N.E.A. in the index number
computed by the Research Division are
also shown since comparisons are made with
that index number. Since the N.E.A. weights
were applied to the series as they stood (each
series with its natural weighting), the weights
are not directly comparable to those which
have been used throughout the experiments in
the present study, which were all applied to
series that were equally weighted to begin
with. These N.E.A. weights are therefore
shown in two forms in Table XX: first the
nominal form, and second the effective form.
The latter represents a combination of the
nominal weights and the natural weights of
the various series, thus giving the weights
that would have been applied to equally
weighted series to produce the same effect.
All of the weighting patterns shown have been
divided through by the smallest weight
(greater than zero) so as to reduce the smallest
weight to unity and thus facilitate inspectional
estimates of the relative force of different
sets of weights, insofar as this force may
be indicated by the ratios of each set.
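
The combination of nominal and natural weights, as described in the notes to Table XX (multiply the two, then divide through by the smallest product greater than zero), may be sketched as below. The numerical values are hypothetical and are not those of Table XX; the function name is illustrative.

```python
def effective_weights(natural, nominal):
    """Multiply natural by nominal weights, then divide through by the
    smallest product greater than zero so that the smallest effective
    weight becomes unity. A series with zero variability keeps weight zero."""
    products = [a * b for a, b in zip(natural, nominal)]
    base = min(p for p in products if p > 0)
    return [p / base for p in products]


# Hypothetical weights for four series; the third has zero variability.
natural = [2.0, 4.0, 0.0, 1.0]
nominal = [3.0, 1.0, 5.0, 2.0]
print(effective_weights(natural, nominal))  # [3.0, 2.0, 0.0, 1.0]
```

The third series shows why a nominal weight, however large, has no force when the series itself does not vary.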

The results of the various weightings are
shown in Table XXI. It will be noted from
an inspection of the table that weighting A
gives the highest results, B the next highest,
and H the lowest (especially for the last
year). Weighting O, as usual, gives “middle
of the road” results. Weightings A and B emphasize
trait VIII, which, even though the sigmas
have been equalized, is a “slow moving”
trait, being 100 or above for three years of
the six years. Weighting H, on the other
hand, gives least weight to trait VIII, and
produces the lowest values. Weights of the
other traits, of course, also enter in to affect
the results observed.

It is interesting to observe that, with a sin- 
gle exception in the table, all of the inde, 


TABLE XX

WEIGHTING PATTERNS APPLIED TO THE EIGHT ITEMS (TRAITS)
ENTERING INTO THE COST-OF-LIVING INDEX NUMBER


Cloth- 
Weighting Food ing Rent 
Pattern Il III 
Natural* a i 3.4 4.5 
N.E.A. Nominal** : 6.0 5.9 
N.E.A. Effective? 3 5.6 
0 1 


I 
( 
I 
I 


‘ee 
leew 


G’ 

H 

Fuel, Fur- Trans- Inter- Miscel-
Light nishings portation est laneous
IV V VI VII‡ VIII
2.1 3.6 | 0 1
2.3 2.7 7 1 3.6
1.0 2.0 , 0
1 1 1
6 6 9
7
6
6
5
3
11 1 3
3.7 ~~ 1
9 11 1

* These are the ratios of the standard deviations of the various traits to the smallest
standard deviation greater than zero.


** These are the (ratios of the) weights applied by the N.E.A. to the series as they stood,
each series having its natural weighting. These weights are those reported by the N.E.A.,
divided through by the smallest weight so as to make the smallest value unity.

† These weights represent the combined effect of the natural and nominal weights; they
are the ones which should be taken as the real weights affecting the traits entering into the
N.E.A. index number. They are obtained by multiplying the natural weights by the nominal
weights, and dividing through by the smallest product greater than zero, so as to reduce the
smallest weight to unity.

‡ The observed series for trait VII had zero variability; that is, all of the values in the
series were the same. Its effective weight therefore becomes zero, regardless of any nominal
weights applied to it. It has no effect in the positioning of any case (year) in the index number.
It was omitted in the experimental work, and the entire column of weights could be
dropped from the table.

Since in pattern G the base weight (1) becomes ineffective, a row G′
is given, showing ratios to the smallest effective weight in that pattern.
TABLE XXI

COST-OF-LIVING INDEX NUMBERS DERIVED FROM VARIOUS WEIGHTINGS
OF EIGHT TRAITS


N.E.A.* 0 A B Cc 
100.0 100.0 100.0 100.0 100.0 


EXPERIMENTAL 


G H 
100.0 100.0 
98.6 98.4 


D E F 
100.0 100.0 100.0 
99.0 99.0 99.1 


98.6 99.2 
94.5 
88.8 
84.5 
88.1 


91. 
83.5 
77.3 


80.6 


99.0 
94.7 
89.3 
84.6 
87.3 


99.4 
96.4 
92.6 
88.6 
90.2 


99.3 
95.2 
90.4 
86.3 
88.5 


94.5 
88.6 
83.4 
85.4 


95.2 
90.1 
84.7 
86.6 


94.7 
88.4 
82.7 


85.8 


93.3 
87.6 
83.1 
86.8 


92.8 
86.5 
82.1 
86.0 


* As given by the Research Division, p. 236, of the reference cited in footnote 14.


numbers for all of the years are higher than the
N.E.A. index numbers. This fact brings
into relief one of the deficiencies of the sampling
provided by the weighting patterns
used. It is possible so to arrange the weights
in any one of the patterns A-H that an index
number value lower than the N.E.A. values
will result. For example, the pattern: 11, 7,
5, 6, … 3 does this. It was realized from
the start that the patterns should have been
reversed and rotated; but facilities were not
available for following up all of the possibilities
which presented themselves. It was
thought best, within the scope of the work set,
to follow the one pattern systematically.

The data in Table XXI are analyzed in
Tables XXII and XXIII. The average difference,
over the six years, for any two pairs

TABLE XXII

DIFFERENCES IN THE VALUES OF COST-OF-LIVING INDEX NUMBERS RESULTING FROM
DIFFERENT WEIGHTINGS OF EIGHT TRAITS

Averages are for the six years


Equal Weights (0) and Special 
Weights (A-H) 


Aver. Min. 
Diff. Diff. 
. 2.05 0 
.80 0 
.30 0 


Special Weights (A-H) 


Aver. 
Diff. 
1.58 
2 99 
2.71 
1.93 


Pairing of 


Index Nos. 


a 
IO 
S) 
Oho to 





.67 
35 
.73 
91 
1.50 


Differences are in units of per cent, the base year being 100%. 
columns in Table XXI, is given in Table
XXII, together with the minimum and maximum
differences. Differences in which weighting
patterns A, B and H are concerned are the
large ones; apart from these, there is only one
average difference greater than 1.00, and only
… maximum differences greater than 2.00.

In Table XXIII are given the averages in the
other direction, that is, across the different index
numbers, in all possible pairings for each year.
Each “average difference” in this table is the
average of 36 differences, between the columns
of Table XXI, for a single row. Looking
at it another way, each average difference
in Table XXIII represents the average of the
36 pairings shown in Table XXII, but broken
down for only a single year at a time. When
proper weighting is applied, and errors accumulating
from the rounding of decimals are
allowed for, the average for the whole table
should equal the combined averages of Table
XXII.
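
The two directions of averaging just described may be sketched as follows. The figures are hypothetical (three weightings over three years, not the values of Table XXI), and the function names are assumptions made for illustration. With nine experimental columns (O and A to H) the number of pairings is 36, as the text states.

```python
from itertools import combinations


def pair_averages(table):
    """Average absolute difference for each pair of columns, taken over the rows."""
    cols = list(table)
    n_rows = len(table[cols[0]])
    return {
        (a, b): sum(abs(x - y) for x, y in zip(table[a], table[b])) / n_rows
        for a, b in combinations(cols, 2)
    }


def row_averages(table):
    """Average absolute difference over all column pairs, for each row (year)."""
    cols = list(table)
    pairs = list(combinations(cols, 2))
    n_rows = len(table[cols[0]])
    return [
        sum(abs(table[a][r] - table[b][r]) for a, b in pairs) / len(pairs)
        for r in range(n_rows)
    ]


# Hypothetical index numbers; the base year is left out, as in the tables.
table = {
    "O": [99.0, 95.0, 90.0],
    "A": [99.5, 96.5, 92.0],
    "H": [98.5, 94.0, 88.5],
}

by_pair = pair_averages(table)
by_row = row_averages(table)
grand_from_pairs = sum(by_pair.values()) / len(by_pair)
grand_from_rows = sum(by_row) / len(by_row)
n_pairs_nine_columns = len(list(combinations("OABCDEFGH", 2)))

print(n_pairs_nine_columns)  # 36
```

Apart from rounding, the two grand averages agree, since each is the total of all differences divided by the number of pairs times the number of years.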


TABLE XXIII

DIFFERENCES IN COST-OF-LIVING INDEX
NUMBERS FOR EACH YEAR, RESULTING
FROM DIFFERENT WEIGHTINGS


the 36 different 
for that year 


Min. Max. 
Diff. Diff. 


1.0 
3.6 
6.0 
6.5 


5.8 


1 od 4.6 


Differences are in units of per cent, the base
year (1928-29) being 100 per cent. Base year,
in which the differences were zero, is not
included in the averages.

The large maximum differences in Table
XXIII are due to the three weighting pat- 
terns already noted as causing extreme val- 
ues. Aside from these three patterns the 
maximum differences for the five years are as 
follows 0.8, 2.4, 3.6, and 2.7; and 


2.0 
, ‘ 
average differences are much less. 


In other words, when weights are applied which do not
emphasize the idiosyncrasy of some unusual
trait, differences in the resulting index number
values are not likely to exceed the limits
of reasonable fluctuation. Cost-of-living index
numbers are not expected to be accurate
within three or four per cent when applied
over a large area, or to different
groups; they probably are not accurate
within that range even for a specific locality
and particular expenditure level, on account
of the ordinary problems of sampling.

One must of course note that there is a difference
of 11.3 between the N.E.A. index number for
1932-33 and the index number for weighting
pattern A, as shown in Table XXI; and
excluding patterns A and B there is a difference
of 7.5 for 1933-34 between the results
of the N.E.A. weighting and weighting ….
Such differences are larger than one would
care to have in his index numbers. But it
must be remembered that the weighting patterns
in Table XX are assigned mechanically
and do not represent any intelligence. One
has merely to consider: Is it likely that anyone
would, by the exercise of judgment, assign
only 2 per cent (1/48) of the total expenditure
to rent, as is done in pattern …?
(Rent is the only trait which makes a significant
drop in 1933-34; all of the patterns
which accord it little weight are therefore
high for this year.) The differences which
are shown in Tables XXI-XXIII are not to
be taken as indicating the amount of error
likely to arise by virtue of reasonable weighting,
but rather as the amount of difference
that might arise when weighting is unreasonable:
to wit, pattern A with its 2 per cent
allowed for food.

It may be pointed out in passing that the
method used by the N.E.A. in handling trait
VII, which has no variability, while it is the
orthodox method in economic index numbers,
operates in the direction of lowering the variability
of the resulting index number. That
is, including a constant series tends to keep
the index number slightly nearer the base year
(or, more exactly, nearer the constant value
of the trait in question). If, on the other
hand, the trait is omitted from the summations,
on the basis that its variability is zero
and therefore its effectiveness in the positioning
of any case (year) is nil, the index number
derived from a given set of weights will be
somewhat livelier. Logic may, however, require
the inclusion of such series, as it probably
does in the present instance.
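
The damping effect of a constant series may be seen in a small sketch. The series below are hypothetical, each expressed relative to a base year of 100, and the function name is an assumption introduced for illustration.

```python
def index_numbers(series, weights, base=0):
    """Weighted sum for each year, expressed as a per cent of the base year."""
    n_years = len(series[0])
    sums = [sum(w * s[t] for w, s in zip(weights, series)) for t in range(n_years)]
    return [100.0 * v / sums[base] for v in sums]


# Hypothetical series over three years (base year first).
varying = [[100.0, 90.0, 80.0], [100.0, 95.0, 85.0]]
constant = [100.0, 100.0, 100.0]   # a trait with zero variability

without = index_numbers(varying, [1, 1])
with_constant = index_numbers(varying + [constant], [1, 1, 1])

print([round(v, 1) for v in without])        # [100.0, 92.5, 82.5]
print([round(v, 1) for v in with_constant])  # [100.0, 95.0, 88.3]
```

Including the constant series pulls every year toward the base value; omitting it gives the "livelier" index number spoken of above.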

Table XXI is not analyzed by the correlation
technique, as in the preceding four experiments,
because there are only six cases
for each correlation, and the results would not
be regarded as significant. The correlations
are obviously high, and if rank correlations
were calculated, most of them would be perfect.
Another difficulty in the interpretation
of correlation values for such an index number
is that the range of observations is of
moment in determining the value that is
found. The range of some twenty points may
or may not be regarded as a satisfactory
variation in cost-of-living index numbers for
the purpose of calculating correlations. This
problem does not arise in connection with the
set of states, or counties of a state, where
the variability is complete, at least for any
given time represented by the index number.
It is of interest for the present experiment,
however, to report that the average intercorrelation
between the basic data is .821. This
value is given, not as necessarily typical, but
only as a necessary factor in the consideration
of the results in Table XXII.


TYPE OF INDEX NUMBER CALCULATED


An index number is essentially a sum, or
average, of the weighted variables. Either
the original variables, or the average, is usually
expressed as a per cent of the values for
the base year or place. There are many
different formulas which can be used for calculating
an index number, and the values obtained
depend somewhat on the particular
formula used. After an elaborate analysis
of various formulas, Fisher selected eight
which he regarded as the best.
Six of these eight “best” formulas become
identical when constant weights are used,
and therefore provide a natural form to use
in the present work. The common form
taken by these formulas is:

$$I_i = \frac{\sum_{j} w_j X_{ij}}{\sum_{j} w_j X_{0j}}$$

or, more simply, but less specifically:

$$I = \frac{\sum wX}{\sum wX_0}$$

The numerator of this formula indicates that
the various traits (X) are to be weighted (w)
and summed for any given state (or year,
when time is the independent variable); this
numerator is then expressed as a ratio of a
similar summation representing values for a
base state or year.
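
As a minimal sketch of the form just described, the calculation for a single case may be written as below. The weights and trait values are hypothetical, and the function name is an assumption; the result is expressed as a per cent of the base.

```python
def index_number(weights, values, base_values):
    """Weighted sum for one case as a per cent of the weighted sum for the base."""
    numerator = sum(w * x for w, x in zip(weights, values))
    denominator = sum(w * x0 for w, x0 in zip(weights, base_values))
    return 100.0 * numerator / denominator


weights = [3, 1, 2]            # hypothetical constant weights, one per trait
base = [50.0, 20.0, 30.0]      # hypothetical trait values for the base year
year = [45.0, 22.0, 27.0]      # hypothetical trait values for a later year

print(round(index_number(weights, year, base), 1))  # 91.7
print(index_number(weights, base, base))            # 100.0
```

The same weights appear in numerator and denominator, which is the sense in which the weights are constant from case to case.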
The weights are constant from state to
state (that is, constant over a column), but
vary from trait to trait (that is, w and X are
both variables in any single index number
value).

Base values are of course constant, and it
was not necessary to divide through by base
values for the first four experiments here reported,
since such a division would not affect
the correlations or the ranking. If it had
been necessary, probably the index number
based on average values for the 48 states
would have been used as a base. In the fifth
study, the values for 1928-29 were used as
the base.

In economic index numbers the weights ordinarily
vary from year to year. It does not
seem appropriate to vary them in index num- 
bers for rating purposes; whether they should 
be varied for a cost-of-living index number 
may be debatable. 


FURTHER STUDIES OF WEIGHTING NEEDED

The present investigation has been limited
in scope, with the intention of presenting the
effects of various weights under normal operating
conditions. It has not run down entirely
and systematically any of the numerous
avenues which have opened up as the work
progressed. The following topics are suggested
as fruitful for statistical experimentation:


1. Explore more fully the possible arrange- 
ments of a given set of weights. Only a small 
sampling was used in the present experiments. 

2. Study the effects of more forceful sets of
weights.

3. Compare the variations produced by
weighting, and those produced by different
index number formulas. (Would probably
involve variable weights.)

4. Compare the variations in any series of
index numbers published for the states, produced
by different reasonable weights assigned
to the traits by different judges, with
variations between the index numbers prepared
by different workers. (A comparison
of the effects of reasonable weights with the
validity of the traits selected to compose an
index number.)

5. How should the force of a set of weights 
be measured? Various methods are possible; 
which method most closely indicates the abil- 
ity of a given weighted series to determine the 
position of a case in the final composite? 
6. What is the reasonable limit of relative
weight that can be assigned to a series having
a given degree of uniqueness without fear of
producing a change in the resulting composite
or index number, that might be seriously
misleading? (Assumes one is interested
in determining limits within which
weighting may be done without fear of causing
significant error because of poor judgment.
Within such limits many workers
would be willing to weight, trusting that whatever
change was produced would be a tendency
in the right direction.)
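
As regards the first topic, one simple way of enumerating further arrangements of a given set of weights is to take every rotation of the pattern and of its reversal. The sketch below is offered only as an illustration; the six-value pattern is hypothetical and is not one of the patterns of the present study.

```python
def arrangements(pattern):
    """All rotations of a weight pattern and of its reversal, without duplicates."""
    seen = []
    for base in (list(pattern), list(reversed(pattern))):
        for k in range(len(base)):
            rotated = base[k:] + base[:k]
            if rotated not in seen:
                seen.append(rotated)
    return seen


pattern = [1, 3, 5, 7, 9, 11]   # hypothetical weights
print(len(arrangements(pattern)))  # 12
```

Each arrangement so produced could then be applied to the same equally weighted series and the resulting index numbers compared. A full permutation of the weights would give many more arrangements than rotation and reversal alone.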


SUMMARY 


Sets of weights, having a maximum ratio of
10 or 11 to 1, have been applied to four different
sets of data, and the resulting differences
analyzed. The first set of data was used in
two forms: actual values and rank values.
The first three experiments dealt with index
numbers for educational activities of the 48
states; the fourth experiment was concerned
with education in 120 counties of one state,
and the fifth experiment was made with a
cost-of-living index number. The results, and
the principal analyses, are presented in a
series of tables as follows:


Weighting Patterns 

Inde x Numbers 

Intercorrelations of Index Numbers 
Average Differences, for Group 
Differences for Individual Case 


Exp. 1 


N.E.A. 
Actual 


{TION [Vol. 6, No. 
usually believed. It is necessary, however, to
bear in mind the factors which determine the
importance of weighting.

The effectiveness of weighting appears to
depend upon the following factors: (1) the
force of the set of weights; (2) the uniqueness
(lack of correlation) of a particular
series; (3) the shape of a series (as compared
with the shapes of the other series, when
plotted on some common scale); or the linearity
and general homogeneity in intercorrelations;
(4) the number of series entering
into the composite, or index number; (5)
whether the weights are constant, or variable.

Weights are not to be thought of as effective
solely at their face value. The character
of the series to which weights are assigned
appears to have as much to do with the effect
of the weights as the relative force of the
weights themselves.

When the heavy weights are attached to
series which are relatively unique, a significant
difference is almost certain to occur in
the placement of cases in the composite.

The general effect of weights on a set of
series is probably in proportion to some inverse
function of the average intercorrelation
of the series.
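
The relation between intercorrelation and the effect of weights may be illustrated by a small simulation. It is a sketch under stated assumptions and not a reproduction of any experiment reported here: the series are artificial, built from a common factor plus independent noise, and the function names and the parameter `common_share` are introduced for illustration only.

```python
import random


def pearson(x, y):
    """Product-moment correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def composite(series, weights):
    """Weighted sum of the series, case by case."""
    n = len(series[0])
    return [sum(w * s[i] for w, s in zip(weights, series)) for i in range(n)]


def agreement(common_share, n_cases=500, n_series=5, seed=0):
    """Correlation between an equally weighted composite and one in which
    the first series carries a weight of 11 against 1 for the others."""
    rng = random.Random(seed)
    common = [rng.gauss(0, 1) for _ in range(n_cases)]
    series = [
        [common_share * c + (1 - common_share) * rng.gauss(0, 1) for c in common]
        for _ in range(n_series)
    ]
    equal = composite(series, [1] * n_series)
    heavy = composite(series, [11] + [1] * (n_series - 1))
    return pearson(equal, heavy)


high = agreement(common_share=0.9)   # series highly intercorrelated
low = agreement(common_share=0.1)    # series nearly independent ("unique")
print(round(high, 3), round(low, 3))
```

When the artificial series are highly intercorrelated the two composites agree almost perfectly; when the series are nearly independent the same weights displace the cases considerably.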

Exp. 3 Exp 
Schrammel 
and Son- 
nenberg 
XIII 


Exp. 2 Exp. 4 
Cost 
Living 
XX 
XXI 


Chamber- 
lain 


N.E.A. 
Ranks 
I VII XVII 
II VIII pie aaa 
III xI XIV XVIII 
V ae XV XIX 
VI aor XVI née 


XXII 
XXIII 


Conditions and results may be briefly presented as follows:


Population: Number of Cases - 
Number of Traits ~— 


Average Intercorrelation of Traits a 


Average Intercorrelation of Index Numbers__ 


Lowest Intercorrelation of Index Numbers - 
Averags 


Largest Difference in Placement of a Case 


CONCLUSIONS 
An unrestricted generalization that the
weighting of index numbers is important, or
unimportant, is not justifiable. Weighting
may, under certain circumstances, be of the
greatest importance. On the other hand, under
ordinary conditions, moderate weighting
probably is not of as great importance as is

Exp. 1 


Difference in Placement of a Case ___-_ 


Exp. 2 
48 48 
5 5 
yf i 13 
98 ‘ : 
94 e .78 
5 ; 4.8 
19 _— 28 


Particular cases (as states, or years) that 
are heterogeneous with reference to the val- 
ues in the different traits, will individually 
show greater response to weights than cases 
in the series which are more homogeneous 
in their values. 


Ranking the component traits before
weighting and combining into an index number
has only a moderate effect on the placement
of cases in the result. Ranking exerts
an influence in addition to that of making the
variability of the series equal. The extent of
the additional change depends on the shape
of the original frequency distribution, and it
is further acted upon by any assigned weights.


Rank correlation coefficients are substantially
the same as product-moment correlation
coefficients for the data here worked
with; the rank correlations are slightly lower,
on the whole. The difference between product-moment
correlation coefficients and rank
correlation coefficients is comparable to the
difference which occurs in product-moment
correlation coefficients when calculated with
actual values and product-moment correlation
coefficients when calculated from grouped
data. (The data in support of this point
were not presented in the report, but appear
in the work sheets.)
Equal weighting appears to give results
somewhere around the middle between the
worst possible weighting and the best possible
weighting.


No differences in index numbers were produced
by any of the experiments which could
with assurance be said to exceed the limits of
accuracy that should be allowed for validity
of the traits selected to represent the general
concept, and reliability of the reported data.

A set of weights is essentially a set of
ratios. They may be divided or multiplied by
a constant without changing the effect on the
placement of any case in the index number.


Where the actual magnitude of the resulting
index number is important, in contrast
with simply correct ratios between the values,
or with rank position, additional factors
enter, and the situation is somewhat more demanding.
Multiplying or dividing the weights
by a constant, however, will not affect the
value of the index numbers if they are referred
to a base that is weighted accordingly.
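
That a set of weights acts only through its ratios may be checked with a short sketch. The three cases and their trait values are hypothetical; the multiplier of 20 follows the note that weights running from 1 to 11 are as forceful as weights running from 20 to 220.

```python
def composite_scores(cases, weights):
    """Weighted sum of the trait values for each case."""
    return [sum(w * x for w, x in zip(weights, case)) for case in cases]


def ranking(scores):
    """Case positions ordered from the highest score to the lowest."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


cases = [[70, 40, 55], [60, 65, 50], [80, 30, 45]]   # hypothetical trait values
weights = [1, 5, 11]
scaled = [20 * w for w in weights]                    # 20, 100, 220

plain = composite_scores(cases, weights)
multiplied = composite_scores(cases, scaled)

print(ranking(plain), ranking(multiplied))  # [1, 0, 2] [1, 0, 2]

# Referred to a base (here the first case) weighted in the same way,
# the index values themselves are also unchanged.
plain_index = [100.0 * s / plain[0] for s in plain]
multiplied_index = [100.0 * s / multiplied[0] for s in multiplied]
```

The raw sums differ by the constant factor, but the placement of the cases and the index values relative to a similarly weighted base do not.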


Does it pay to weight? The question of
weighting in any particular case must be decided
some way. Natural (observed) weighting
is largely a product of the units that are
used, if they differ from trait to trait. Equal
weighting is arbitrary, and requires a decision
as much as any other weighting. One may
venture reasonable weights without fear of
affecting the results markedly, if the series are
well interrelated; and, if he believes that he
can assign weights that are somewhat better 
than the observed weights or than equal 
weighting, he should do so, taking advantage 
of whatever improvement may result. That 
is, there is little risk in moderate weighting, 
and whatever change there is may reasonably 
be expected to be a favorable one. 


One will not be absolute in his interpreta- 
tion of index numbers, recognizing that many 
acts of judgment necessarily enter into them, 
and that the values yielded are no better than 
the quality of judgment which has acted upon 
them at many points. 


NOTES 

1 Douglas E. Scates, “The General Nature and Applicability
of Index Numbers for Education,” Journal of Experimental
Education, IV:265-78. March, 1936.

2 For a brief review of these index numbers, see the reference
in footnote 1, or the following: “Statistical Analyses of
State School Systems,” pp. 104-112; in “Estimating State
School Efficiency,” Research Bulletin of the National Education
Association, Vol. X, No. 3, May, 1932.

3 Complete bibliographical references to the various studies
by Ayres, Phillips, and Schrammel will be found in the references
cited in footnotes 1 and 2.

4 H. E. Schrammel and E. R. Sonnenberg, “The Rank of
States According to Educational Achievement on the Basis of
Ten Selected Criteria,” American School Board Journal,
93:17-19. November, 1936.

5 Douglas E. Scates, “Revised Index Number of State School
Systems,” American School Board Journal, 94:52-53. June,
1937.

6 “Present Standing of State School Systems on Five Factors
Related to Efficiency,” pp. 113-131; in “Estimating State
School Efficiency,” Research Bulletin of the National Education
Association, Vol. X, No. 3, May, 1932.

7 It will be noted that weights running from 1 to 11 are as
forceful as weights running from 20 to 220, or any others in
like proportion.

8 Douglas E. Scates and F. R. Noffsinger, “Factors Which
Determine the Effectiveness of Weighting,” Journal of Educational
Research, 24:280-85. November, 1931.

9 Within the limits of sampling of the various weighting
patterns, as previously pointed out. All of the statements in
this discussion must be interpreted as limited to the conditions
underlying the present set of data.

10 In the calculation of coefficients of this type, formulas
given in the following references are helpful:

Herbert S. Conrad, “On the Calculation of the Correlation
Between a Single Element of a Composite and the Remainder
of the Composite,” Journal of Educational Psychology,
26:611-615. November, 1935.

Edwin E. Ghiselli and George Kuznets, “Short-Cut Methods
for Calculating Raw and Corrected Correlations Between a
Composite Variable and Its Components,” Journal of Educational
Psychology, 28:237-240. March, 1937.

11 Such a result would not necessarily be the case, since the
rank correlation coefficient is based on squares of the differences,
and may drop more rapidly than the average of the
differences increases. It may even drop when the average
difference decreases.

12 Data from the original report (note 4) were used, rather
than Scates’ revision, which was made after the present study
was well along. Two obvious errors in printing were corrected,
however, before the data were used.

13 Leo M. Chamberlain, “Measures of Educational Performance
in the County School Districts of Kentucky,” Bulletin
of the Bureau of School Service, College of Education, University
of Kentucky, Vol. VI, No. 4, June, 1934. 42 p.
Data from Table 1, pages 25-34.

14 “The Teacher’s Economic Position,” Research Bulletin of
the National Education Association, Vol. XIII, No. 4. September,
1935. Chap. VII, “Changes in Cost of Living with
A TEST FOR MEASURING TEACHERS’ KNOWLEDGE OF
THE CONDUCT AND PERSONALITY OF CHILDREN
FROM SIX TO EIGHT YEARS OF AGE*


MARTHA A. O'DANIEL RINSLAND


State Supervisor of Federal Nursery Schools and Family Life Education,
State Department of Education, Oklahoma City, Oklahoma 


I. INTRODUCTION AND STATEMENT 
OF THE PROBLEM 


The newer trends in education are toward
the development of the whole child. The
way in which he learns to adjust to the school
environment is of more import and significance
to the educator than the teaching of
specific subject matter. While there is some
disagreement among recognized authorities as
to the techniques used in child development,
there is almost a unanimity of opinion as to
its importance. The newer conatus emphasize
the importance of studying the child in
all his learning situations.

The becoming aware of something more
subtle, more intrinsic, more potent, and of
something of more moment than the three
R’s has been exceedingly gradual, but quite
definite. The way in which a child meets
situations is more important to modern educators,
psychologists, mental hygienists, psychiatrists,
and parents than speed in reading,
the number of words in his vocabulary, or the
accuracy and speed of learning number combinations.
The deadening process of “busy
work” is being replaced by freedom to do
original, creative work. This stimulates the 
child to do things for himself and to share 
with others. 

Today’s collimations in education are defi- 
nitely toward the development of more stable, 
happy and wholesome childhood, which is the 
foundation of hygienic adolescence and ad- 
justed adulthood. The important question 
asked of elementary teachers about their 
preparation is no longer primarily concerned 
with their grades in subject matter, but with 
their attitudes toward child life, family rela- 
tions, and teacher-parent appositions. Can 
she early detect, diagnose, and treat symp- 
toms in conduct cases? Can she distinguish 
the child from his fault? Can she talk unemotionally
to the parent of the “problem child”?
Does she recognize the right of every child to
a happy school environment? Does she see
the individual's need of being understood on
his own level?

* Summary of a thesis for the Ed. D. degree, University of
Oklahoma, 1936.

The child of this new day is being studied 
and treated in laboratories, clinics, hospitals, 
homes, communities, churches, and on the 
streets. Each specialist is taking one phase 
of the child and learning all that is possible 
concerning it. Biochemists are delving into 
the mysteries of the composite child and what 
effects certain chemicals have on his physical 
growth and emotional stability. Some pedia- 
tricians are studying glands and their influ- 
ence on the personality of the child. Dieti- 
tians are analyzing foods and ascertaining 
what each contributes to the health of the 
child. Mental hygienists are learning the 
causes of mental deviation and are preventing 
them from becoming permanent problems. 
Psychologists are studying how the child 
learns and why he behaves as he does. The 
need of these and other experts in the field is 
exigent. However, it is not under these spe- 
cialists, but under the care of a teacher that 
the child is placed for subject matter learn- 
ing and character building. She must be 
trained to synthesize all known facts for the 
wholesome growth of the child. 

The new subject matter of education is the 
child. His personality is so complex that it 
challenges the best teacher. It takes a mini- 
mum of six years for a physician to learn his 
profession and to be able to minister to the 
physical needs of the child. Years of study, 
both in high school and in college, are re- 
quired before one can teach English. The 
teacher is forced to absorb volumes of “sub- 
ject matter,” whereas only a course or two in 
the difficult and intricate subject of child 
learning is required. However, the present 
trends are leading toward the real subject
matter, the child as a large book of unknown
potentialities, which must be studied as no
other subject has been or is being studied.

Just how much do teachers know about the 
child’s growth and learning; about guiding 
and developing his personality? There are 
no tests by which to measure teachers’ knowl- 
edge of these things. This is the problem of 
this thesis. Therefore, the first step in this 
study demanded the construction of a compre- 
hensive test in this field of knowledge. An 
analysis will be made of teachers’ scores in 
the terms of a number of factors contributing 
to their knowledge of child development. 
Such factors will include their college courses 
in education, psychology, child development, 
and years of teaching. 


II. SOURCES AND STRUCTURE OF THE TEST


Personality factors are somewhat elusive 
and hard to define. Just what is wholesome 
and what is not wholesome in the children of 
early school years is very difficult to deter- 
mine. The relative importance of the differ- 
ent factors has not been determined by re- 
search. The literature in this field is largely 
psychological, physiological, and philosophical,
rather than factual, definite, conclusive,
and statistical. 

To secure a measurement of the teacher’s 
knowledge of personality potentialities and to 
discover how they are developed is a most 
perplexing problem. A valid test should be 
very comprehensive, but short enough to give 
to teachers whose day is so full that it is im- 
practical to use a test that takes more than 
two hours to complete. Laboratory experi- 
ments cannot be used in this type of study 
since it is largely a subject matter test on the 
teacher's knowledge of the whole child. 

The test must sample broadly and measure 
accurately the teacher’s knowledge, which she 
has gained through study, observations, and 
experiences in teaching the young child. It 
must not appear to the teacher as a test, but 
as a questionnaire. This technique of meas- 
uring has the value of more nearly getting the 
teacher's honest reactions to given statements. 
She is not confronted with a rating by her 
supervisor, and neither is she fearful of her 
tenure in the school as determined by her 
frank responses or scores on a test. While 
teachers give many tests, it is well known that 
they do not like to take them. 

In such a questionnaire the answers cannot 
be absolutely objective; therefore, there must 
be a variability of answers expressing degrees 
of correctness for each situation tested. The
relative degree of correctness can be determined
only by pooling the opinion of experts
in the field. Data from such a questionnaire,
however, may be considered objective as far
as the teacher’s impartial answers and the in- 
vestigator’s detached manner of scoring are 
concerned. When the scoring is statistically 
interpreted, the investigator is removed from 
subjective elucidations of the teacher’s knowl- 
edges of the child’s personality. 

The objective form chosen for such a ques- 
tionnaire is the multiple choice, which admits 
of a rating by experts, and also of a choice 
in the best answer by the teacher. Thus de- 
grees of correctness in knowledge and in prac- 
tice can be measured. An odd number of 
possible answers or solutions is desired so that 
extremes may be shown and averages ascer- 
tained. Five responses for each question were 
chosen because they offer a sufficient degree of 
variability of knowledge and of practice. 
Bain¹ found that five degrees of teaching ability
could be accurately judged. In this form
the questionnaire is also a test. It appears to 
the teacher as a questionnaire from which an 
investigator is compiling teachers’ opinions, 
but in reality it is a test of teachers’ knowl- 
edge, sufficiently objective in nature to make 
a reliable measuring instrument for research.

General or average practices, and not the 
treatment of individual cases, form the basis 
of each item to be tested. The statements 
are not grouped in categories, as chance place- 
ment more nearly prevents the teacher from 
being influenced by any perceptible order.
One hundred and thirty-four items were se- 
lected for the test. 

The source material for the elements of this 
test has come from books, magazines,² lectures
of experts in the field of child develop-
ment, and courses in child psychology, child 
welfare, experimental psychology, general 
psychology, mental hygiene, abnormal psy- 
chology, educational psychology, parent edu- 
cation, teaching, supervising, and family 
relationships. 

These sources may be considered as a curricular
validation of the test items. The
items represent statements of problems actually
occurring in the writings and lectures
of our best authorities and in situations that
the writer knows are real in the lives of many
children.


¹ Bain, W. E., An Analytical Study of Teaching in Nursery
Schools, Kindergarten, and First Grade. New York: Columbia
University, Teachers College, Bureau of Publications, 1928.

² An extended bibliography of 172 titles, on which this
article is based, is included in the original manuscript of the
thesis on file in the library of the University of Oklahoma.


The final test consists of one hundred and
thirty-four statements about children’s con- 
duct and personality development. Each 
statement is followed by five possible answers. 
Teachers were asked to select the one answer 
which they thought to be best. 


The Construction of the Scoring Key 

In a test which does not have right and 
wrong answers, but where degrees of correct- 
ness exist, best answers must be determined 
by the pooled judgment of expert judges. 
Nine judges were chosen from the field of 
psychology, teacher training, and child de- 
velopment. The judges are from five differ- 
ent states and represent college professors, 
writers, and supervisors of teacher training 
and home making. 


A mimeographed copy of the Test-Ques- 
tionnaire was sent to each judge with the fol- 
lowing directions: 

Directions: Read each statement and the 
five answers following the statement. Rank 
the answers in order of your preference; i.e., 
number the best answer 1, the next best 2, the 
next 3, the next 4, and the poorest 5. Write
the numbers above the answers or in the 
parentheses before the statement. 


The rankings by the nine judges of the five 
responses for each item of the Test-Question- 
naire were summated. These sums were then 
converted into final values for each response 
by assigning five points to the best answer 
and one point to the worst answer, with two, 
three, and four representing the intermediate 
values. Since there were nine judges and five 
responses rated by each judge from one to 
five points, the totals for a perfect agreement 
of judges for the five items would be: first or 
best, 9 points; second, 18 points; third, 27 
points; fourth, 36 points; and fifth, 45 points. 
These total points were converted to final val- 
ues of 5 (best), 4, 3, 2, and 1. 
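
The arithmetic of perfect agreement can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical and do not come from the original study. It computes the totals that nine identical rankings would give, and the fixed sum that the five totals of any item should reach when all nine judges rank all five responses.

```python
def perfect_totals(n_judges=9, n_responses=5):
    """Rank sums when every judge gives the same ranking (1 = best)."""
    return [n_judges * rank for rank in range(1, n_responses + 1)]


def expected_row_sum(n_judges=9, n_responses=5):
    """Each judge hands out the ranks 1..n_responses once, so the five
    totals of an item must add up to this figure."""
    return n_judges * n_responses * (n_responses + 1) // 2


print(perfect_totals())    # [9, 18, 27, 36, 45]
print(expected_row_sum())  # 135
```

The row sum of 135 is a convenient check on any line of Table I: a row whose five totals do not add to 135 contains either an omitted ranking or a misprint.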


Perfect agreement of judges was not ex- 
pected in many items. A difference between 
the totals for the successive items must be of a 
certain magnitude before it becomes signifi- 
cant; that is, before it will serve as an ade- 
quate basis for placing the one item above the 
other. A difference which would indicate a 
fair or acceptable degree of agreement should 
be numerically more than half the distance 
between any two sets of perfect totals. This
is 4.5 points, but no decimal can be taken as 
a significant figure, as all ratings were in 
whole numbers. Besides, ratings were more 
or less subjective and in many items two re- 
sponses, especially after the best choice, were 
of almost equal value to a judge. Therefore, 
a difference of 4 points was taken as sufficient 
evidence of agreement on near responses. 
Where a difference of less than 4 points ex- 
isted between responses, the two re- 
sponses were considered to be of equal value 
and they received a value which was the aver- 
age of the two values which would have been 
given had they been clearly differentiated in 
the responses. Items not answered and 
marked “x” were scored zero since the direc- 
tions instructed teachers to mark “x” any item 
not known. Some sample items of the test 
with score values are given here; and Table I 
summarizes the ratings for all items. 
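
The scoring rule described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the author's own procedure: the function name is hypothetical, and the handling of runs of three or more near-equal totals (averaging the whole run) is an assumption, since the text speaks only of pairs of responses.

```python
def final_values(rank_sums, threshold=4):
    """Convert summed ranks (low = preferred) into key values from 5 to 1.

    Responses whose totals differ by less than `threshold` points are
    treated as equal and share the average of the values they span.
    """
    n = len(rank_sums)
    order = sorted(range(n), key=lambda i: rank_sums[i])  # best first
    values = [0.0] * n
    start = 0
    while start < n:
        end = start
        # Extend the tie group while neighbouring totals are too close.
        while (end + 1 < n and
               rank_sums[order[end + 1]] - rank_sums[order[end]] < threshold):
            end += 1
        # Positions start..end would have received values n-start .. n-end.
        shared = sum(n - p for p in range(start, end + 1)) / (end - start + 1)
        for p in range(start, end + 1):
            values[order[p]] = shared
        start = end + 1
    return values


print(final_values([9, 18, 27, 36, 45]))   # [5.0, 4.0, 3.0, 2.0, 1.0]
print(final_values([31, 16, 18, 33, 37]))  # [2.5, 4.5, 4.5, 2.5, 1.0]
```

The second call uses the totals printed for item 75 in Table I and reproduces the values given there.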




Sources of Teachers’ Responses 

To ascertain the knowledge of teachers 
concerning the personality and conduct of 
children, a sampling was made in nine cities 
of three states in the southwest. Two hun- 
dred and fifty-two papers were returned com- 
pletely answered. This is an eighty-two per 
cent response, which is a very high percent- 
age in the light of the usual returns. 


The study had to depend largely upon vol- 
untary response of the teachers to the request 
of the superintendent, principal, or supervisor. 
The questionnaire was long, nine pages of 
multiple choice items, and one page of general 
information relative to the teacher’s training 
and experience. It was a request indeed to 
ask teachers to answer such a long question- 
naire. A shorter one could have been sent to 
a larger number of teachers, but since the first 
object of the sampling was to obtain the sub- 
ject matter of the test, it had to be rather 
inclusive. 


The first page included name and age of the 
person taking the test; present position; years 
of teaching; years of teaching children from 
six to eight years of age; number of college 
years; major study, number of courses in 
child psychology, child welfare, family rela- 
tions, and parent education; number of 
courses in general and educational psychol- 
ogy; and number of courses in general 
education. 
SAMPLE OF ITEMS FROM THE TEST WITH SCORE VALUES*

2. ( ) The same type of friendly relationships existing among children should exist
   between children and teachers.
   1 always  2 usually  3 sometimes  4 rarely  5 never.
   [5, 4, 3, 2, 1]

4. ( ) Criticisms of children by their equals are more effective than those by their
   teachers.
   1 always  2 most of the time  3 rarely  4 never  5 harmful.
   [4, 5, 3, 2, 1]

[item number illegible] ( ) In dealing with children it is essential to find the motive
   back of the reaction.
   1 of great moment  2 expedient  3 of some moment  4 of little importance
   5 of no significance.
   [5, 4, 3, 2, 1]

[item number illegible] ( ) To develop individuals who are willing to accept the
   consequences of their acts is the best type of teaching.
   1 undoubtedly  2 usually  3 sometimes  4 seldom  5 not at all.
   [5, 4, 3, 2, 1]

54. ( ) Compromise between conformity and individualization should become more
   efficient and pleasant for children.
   1 always  2 usually  3 sometimes  4 rarely  5 [illegible]
   [5, 4, 3, 2, 1]

64. ( ) While the child is particularly negativistic, small and unimportant issues
   should be avoided.
   1 yes  2 as a rule  3 sometimes  4 seldom  5 no.
   [5, 4, 3, 2, 1]

66. ( ) The standardized program develops all children equally.
   1 certainly  2 often  3 sometimes  4 rarely  5 not at all.
   [1, 2, 3, 4, 5]

[item number illegible] ( ) Children learn more readily when they are distinguished
   from their faults.
   1 always  2 usually  3 sometimes  4 rarely  5 no.
   [5, 4, 3, 2, 1]

80. ( ) The happy child tends toward the practice of masturbation.
   1 decidedly  2 usually  3 somewhat  4 very seldom  5 not at all.
   [1, 2, 3, 4, 5]

87. ( ) Where teachers are fearful of their rating as teachers, the children's behavior
   responses reflect this anxiety.
   1 absolutely  2 usually  3 somewhat  4 very little  5 not at all.
   [4.5, 4.5, 3, 2, 1]

88. ( ) The genetic approach on the part of the teachers insures happier teacher-pupil
   relationships.
   1 undoubtedly  2 very little doubt  3 doubtful  4 slightly valuable  5 of no value.
   [5, 4, 3, 2, 1]

97. ( ) Emotional disturbances cause children to do inferior work.
   1 always  2 frequently  3 occasionally  4 very rarely  5 never.
   [4.5, 4.5, 3, 2, 1]

109. ( ) Under difficulty, fear, or anxiety, speech defects become more pronounced.
   1 as a rule  2 frequently  3 occasionally  4 rarely  5 no.
   [5, 4, 3, 2, 1]

127. ( ) Economic conditions in the home have effect upon the attitude of the children
   in school.
   1 as a rule  2 frequently  3 occasionally  4 in a few instances  5 never.
   [4.5, 4.5, 3, 2, 1]

130. ( ) The essential thing in developing the personality of the young child is to give
   him factual information.
   1 decidedly  2 to a large degree  3 sometimes  4 only incidental  5 no.
   [1, 2, 3, 4, 5]

131. ( ) Understanding the child as an individual and his relationship to the group is
   paramount in successful teaching.
   1 yes  2 very important  3 somewhat important  4 of slight importance
   5 not essential.
   [5, 4, 3, 2, 1]

* In square brackets are the values for the respective responses in order as given in each
item; that is, the first number is the value for response number 1, the second number is the
value for response number 2, and so forth.
TABLE I

RATINGS OF NINE JUDGES ON FIVE RESPONSES FOR EACH ITEM OF THE TEST AND FINAL
VALUE OF RESPONSES FOR EACH ITEM

Item    Sum of Ratings on Each Response      Final Value
          1    2    3    4    5            1    2    3    4    5
30 15 19 30 41 2.5 5 4 2.5 1 
12 16 26 36 45 o | } 2 l 
13 17 24 36 45 ) 4 ‘ 2 ] 
{ 20) 10 26 35 45 4 >) yA l 
42 22 13 23 D ] 5 ) 2 
33 24 17 22 39 2 ) ) l 
45 6 27 14 ] l z i 4.5 
36 10 17 30 42 2 ) 4 l 
45 36 27 16 11 Z 4 ) 
be] 1S 4 ty 4 { 9 


61 28 14 


4 
l 4: 
l l 4 2 44 2.9 4.5 :.d 2.5 

14 ] 23 2 40 4.5 45 3 72 | 

; 22 i4 20 5 44 ) ) 3.5 > ] 

y 6 13 20 3] 45 3 ) 4 2 l 
ts 29 13 19 35 1.5 ) 5 | 1.5 

2 35 13 18 29 40) 2 5 4 } ] 

52 12 15 33 43 y 4.5 4.5 2 ] 

22 12 20 36 45 5) 5 : yA | 

23 l 19 35 45 4 2 l 

2 23 14 18 28 37 } ) 4 2 l 

2 35 9 18 30 43 2 5 4 } l 

ze 23 18 y a 29 13 5 ) 3.5 2 l 

42 31 17 18 27 ] yA 4.5 4.5 : 

0) 12 1s 32 4:3 2.5 4 2.5 ] 

4] 28 12 20 34 l 5 4 2 

2 45 30 24 12 19 Z } ; 4 
43 3 17 12 30 ] 2.5 4 f 2.5 
} 40 28 19 12 21 ] 2 5 3. 5 
45 5 24 15 16 l Z 4.5 $. G 

15 16 23 36 45 1.5 1 5 2 ] 

12 19 6 33 45 3 4 } Z l 

12 18 17 34 41 5 15 1.5 2 l 
J 42 30 Zo 17 23 ] Z 3.5 5 3. 5 
t 40 2 19 14 15 ] y 1.5 1.5 
 41      45   36   25   15   14            1    2    3    4.5  4.5
 42      45   36   27   14   13            1    2    3    4.5  4.5
 43      45   36   27   17   10            1    2    3    4    5
 44      36   26   23   17   12            1    2.5  2.5  4    5
 45      45   36   27   16   11            1    2    3    4    5
 46      45   29   18   13   31            1    2.5  4    5    2.5
 47       8   16   24   32   40            5    4    3    2    1
 48      44   34   21   15   21            1    2    3.5  5    3.5
 49      45   36   27   15   12            1    2    3    4.5  4.5
 50      40   34   24   14   23            1    2    3.5  5    3.5
 51      43   33   24   14   21            1    2    3.5  5    3.5
 52      37   20   16   21   41            2    3.5  5    3.5  1
 53      29   10   17   34   45            3    5    4    2    1
 54       9   18   27   36   45            5    4    3    2    1
 55      44   35   25   17   14            1    2    3    4.5  4.5
 56      39   27   15   19   35            1    3    5    4    2
 57      45   36   26   15   13            1    2    3    4.5  4.5
 58      16   13   25   36   45            4.5  4.5  3    2    1
 59      15   14   25   36   45            4.5  4.5  3    2    1
 60      13   18   26   33   45            5    4    3    2    1

20 3: 3 5 l 

1! : 4.{ l 

° 2 4 


63 45 36
TABLE I—Continued

SUM OF RATINGS OF NINE JUDGES ON FIVE RESPONSES FOR EACH ITEM OF THE TEST AND FINAL
VALUE OF RESPONSES FOR EACH ITEM

Item    Sum of Ratings on Each Response      Final Value
          1    2    3    4    5            1    2    3    4    5

 64       9   18   27   36   45            5    4    3    2    1
 65      12   16   26   36   45            5    4    3    2    1
 66      45   36   27   18    9            1    2    3    4    5
 67      39   24   15   23   34            1    3.5  5    3.5  2
 68      17   17   25   33   43            4.5  4.5  3    2    1
 69      45   35   23   11   21            1    2    3.5  5    3.5
 70      11   18   25   36   45            5    4    3    2    1
 71      18   14   22   36   45            4    5    3    2    1
 72       9   13   21   28   35            5    4    3    2    1
 73      10   17   27   36   45            5    4    3    2    1
 74      45   36   26   15   14            1    2    3    4.5  4.5
 75      31   16   18   33   37            2.5  4.5  4.5  2.5  1
 76      30   11   16   33   45            2.5  5    4    2.5  1
 77      42   32   19   15   27            1    2    4    5    3
 78      10   17   27   36   45            5    4    3    2    1
 79      45   36   27   16   11            1    2    3    4    5
 80      45   35   26   10   19            1    2    3    4    5
 81      45   34   20   14   22            1    2    3.5  5    3.5
 82      43   29   24   14   25            1    2    3.5  5    3.5
 83      33   20   16   27   39            2    4    5    3    1
 84      45   36   27   17   10            1    2    3    4    5
 85      44   34   25   19   13            1    2    3    4    5
 86      15   14   25   36   45            4.5  4.5  3    2    1
 87      15   13   26   36   45            4.5  4.5  3    2    1
 88      12   17   25   36   45            5    4    3    2    1
 89      41   30   10   21   33            1    2.5  5    4    2.5
 90      13   15   26   36   45            4.5  4.5  3    2    1
 91       9   18   27   36   45            5    4    3    2    1
 92      41   27   14   20   33            1    3    5    4    2
 93      41   33   22   12   27            1    2    4    5    3
 94      23   13   22   36   41            3.5  5    3.5  2    1
 95      44   31   17   13   30            1    2.5  4    5    2.5
 96      41   31   20   14   29            1    2.5  4    5    2.5
 97      15   14   25   36   45            4.5  4.5  3    2    1
 98      42   32   22   15   24            1    2    3.5  5    3.5
 99      45   36   24   14   16            1    2    3    4.5  4.5
100      10   17   27   36   45            5    4    3    2    1
101       9   18   27   36   45            5    4    3    2    1
102       9   18   27   36   45            5    4    3    2    1
103       9   18   27   36   45            5    4    3    2    1
104      22   12   22   35   44            3.5  5    3.5  2    1
105      15   16   25   34   45            4.5  4.5  3    2    1
106      23   13   25   32   42            3.5  5    3.5  2    1
107      17   14   23   36   45            4.5  4.5  3    2    1
108      22   13   19   36   45            3.5  5    3.5  2    1
109       9   18   27   36   45            5    4    3    2    1
110      17   17   26   33   42            4.5  4.5  3    2    1
111      28   11   16   35   45            3    5    4    2    1
112      45   36   27   14   13            1    2    3    4.5  4.5
113      20   16   22   32   45            3.5  5    3.5  2    1
114      27   10   18   35   45            3    5    4    2    1
115      45   33   20   13   24            1    2    4    5    3
116      25   11   19   35   45            3    5    4    2    1
117      19   17   23   32   44            4.5  4.5  3    2    1
118      43   34   18   13   27            1    2    4    5    3
119      22   10   19   30   39            3.5  5    3.5  2    1
120      17   14   24   35   45            4.5  4.5  3    2    1
121      24   15   20   32   44            3    5    4    2    1
122      41   28   13   22   31            1    2.5  5    4    2.5
123      16   14   24   36   45            4.5  4.5  3    2    1
124      45   36   25   14   15            1    2    3    4.5  4.5
125      20   11   23   37   44            3.5  5    3.5  2    1
126      37   19   14   18   32            1    3.5  5    3.5  2
III. ANALYSIS OF TEACHERS' SCORES

The responses for each item of the 252 tests were scored by the key
of values or answers previously described. The values of the 134
items were summed to obtain a total score on the test. The frequency
distribution of the total scores with the mean, the standard
deviation, and the standard error of the mean is given in Table II.

The best unit for comparison of a teacher's ratings with the judges'
ratings is an average and not a total score. Therefore, for each test
the total score was divided by the number of items, 134, to obtain
the mean value score. The maximum obtainable total score would be
from a test with perfect agreement with the key. This would be a
643.2 total score, or a mean value score of 4.8. The frequency
distribution of the mean value score, with the mean, the standard
deviation, and the standard error of the mean value is given in
Table III. Hereafter the scores used will be the mean value scores
and not the total scores. These mean value scores will be called the
"scores."
These distributions show close approximation to the normal curve for
252 cases. This is an index of both validity and sampling. If the
distribution of scores from a large unselected group shows close
approximation to a normal curve, it is evidence of discrimination of
abilities which may be assumed to exist naturally. Discrimination is
an index of validity and reliability, validity being inferred from
approximation to normal distribution, and reliability being inferred
from the wide range of total scores (from 425 to 580). The
difficulties of the test items are well distributed if the
distribution of scores approximates normality for a group assumed as
being normal. Monroe⁴ says, "This index of validity refers to the
differentiation of scores for pupils possessing different degrees of
ability. It is obvious that any lack of objectivity or reliability
will result in a lack of discrimination for certain pupils." Monroe
and Engelhart⁵ [...] "A valid test reveals differences in [...] exist
and one that fails to reveal [...] which is at all marked is
distinctly [...] this quality. Validity may be inferred from the
joint presence of curricular validity and high reliability."

⁴ Monroe, Walter S., An Introduction to the Theory of Educational
Measurements, Houghton Mifflin Company, 1923, p. [illegible].

The reliability of the test was established by correlating even with
odd numbered items and correcting by the Brown-Spearman Prophecy
Formula. The correlation is .854; the reliability coefficient, .922;
and the standard error of a raw score, 8.38. The high reliability,
the probable high discrimination, and the validity of the test make
it a valuable measuring instrument for determining the true knowledge
teachers have of the character and personality of children from six
to eight years of age.

Certain facts about each teacher's training and experience were
obtained by a short questionnaire attached to the first page of the
test. Two methods of interpreting the meaning of scores in the light
of such contributing factors are possible. The first is the method of
correlation analysis. This method presents difficulties when
correlations are low. From observation and tabulation of scores and
contributing factors the correlations appear to be low. Odell⁶ shows
that when correlations are low, interpretations of the coefficient
". . . through approach to or departure from perfect correlation" are
not valuable nor accurate. This method will not be used. The second
method compares the mean score of a high group, the mean score of a
middle group, and the mean score of a low group. If the differences
of means are slight, correlations would be low. The upper quarter
will be taken as the high group, the middle half will be taken as the
middle group, and the lower quarter will be taken as the low group.
Each of the factors which might influence scores will be analyzed by
comparisons of the means of the three groups.

⁵ Monroe, Walter S. and Engelhart, Max D., The Scientific Study of
Educational Problems, The Macmillan Company, 1936, p. 181.

⁶ Odell, Charles W., Statistical Methods in Education, D.
Appleton-Century Company, 1935, pp. 192-199.
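
The split-half correction mentioned earlier in this section can be illustrated with a short sketch. It assumes the standard Spearman-Brown step-up for a test of doubled length and takes .854 as the odd-even correlation, the figure that appears to belong to that passage; the function name is hypothetical.

```python
def spearman_brown(r_half):
    """Estimate full-length reliability from the correlation of two halves."""
    return 2 * r_half / (1 + r_half)


# Odd-even correlation as read from the text (an assumption of this sketch).
print(round(spearman_brown(0.854), 3))  # 0.921
```

The result, about .921, agrees with the reported coefficient of .922 to within what rounding of the half-test correlation would explain.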

Normally, one would expect the number of years in college to have a
positive effect on teachers' knowledge of children's personality and
conduct. The range of years in college was divided into three groups
and the mean score of each of the three groups calculated. The data
are presented in Table IV. There is a slight increase in scores from
the low group to the high group. The difference between the lower
quarter and middle half is .02, and the difference between the middle
half and the upper quarter is .18. One difference is not significant,
the other is significant. When these differences are converted into
total scores they are, respectively, 2.68 and 24.12 (.02 × 134 = 2.68
and .18 × 134 = 24.12). Three points difference in a test where the
maximum total score is 643.5 points, is not significant. However, 24
points are 4/5 (2497.7) of a sigma and are significant. The usual
technique of statistical significance of difference of means is not
applied here as the purpose of these data is not primarily to
generalize such differences for the total population, but to show
differences in the groups compared. The overlapping of the total
distributions of the group is very large. In fact every group has
cases of extremely low and extremely high scores. Lincoln⁷ has shown
that the significance, by the usual method of significance of
differences of means, is often exaggerated when viewed by overlapping
of total distributions. These factors were considered and the usual
interpretations of significance rejected as not contributing to an
understanding of the data. These small differences of means indicate
very low correlations, and support the decision to reject the method
of correlation analysis.

TABLE IV

MEAN SCORE OF TEACHERS IN THE UPPER QUARTER, MIDDLE HALF, AND LOWER
QUARTER OF YEARS IN COLLEGE

Number of        Number of       Mean
Years            Teachers        Score
5.0-6.0             29           4.06
3.0-4.5            195           3.88
[illegible]     [illegible]      3.86

⁷ Lincoln, Edward A., "The Insignificance of Significant
Differences," Journal of Experimental Education, March, 1934,
pp. 288-290.
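
The conversion of mean value differences into total-score points, as used for the years-in-college comparison, can be checked with a small sketch. The constant and function names are hypothetical; only the 134 items and the Table IV means come from the text.

```python
N_ITEMS = 134  # number of items in the test, as stated in the text


def to_total_points(mean_value_difference, n_items=N_ITEMS):
    """Express a difference in mean value scores as total-score points."""
    return round(mean_value_difference * n_items, 2)


# Differences between adjacent groups, from the Table IV means.
low_to_middle = round(3.88 - 3.86, 2)
middle_to_high = round(4.06 - 3.88, 2)

print(low_to_middle, to_total_points(low_to_middle))    # 0.02 2.68
print(middle_to_high, to_total_points(middle_to_high))  # 0.18 24.12
```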


i 


Courses in psychology and education should have appreciable positive
effects on scores on the test. The data of the three groups, in terms
of the number of courses, are given in Table V. The differences here
are also slight. The difference from the low group to the middle
group is .01, the difference from the middle group to the high group
is .08, and the differences are not in favor of amount of training.
It seems that after a teacher has had a number of courses there is no
improvement in score by the addition of courses. Perhaps, semester
hours would have been a more accurate measure of training, but such
accuracy would not have given more significant differences.

TABLE V

MEAN SCORE OF TEACHERS IN THE UPPER QUARTER, MIDDLE HALF, AND LOWER
QUARTER OF COURSES IN PSYCHOLOGY AND EDUCATION

Number of        Number of       Mean
Courses          Teachers        Score
26-33                5           3.81
8-25               169           3.89
0-7                 72           3.90


If courses in psychology and education do not show positive
appreciable effects on the test, perhaps courses in child development
do. The data for these courses are in Table VI, and show only slight
differences. The difference between the low group and the middle
group is .08. No difference between the middle and the high group is
evident. These data raise a serious question concerning the subject
matter of these courses.

TABLE VI

MEAN SCORE OF TEACHERS IN THE UPPER QUARTER, MIDDLE HALF, AND LOWER
QUARTER OF COURSES IN CHILD WELFARE

Number of        Number of       Mean
Courses          Teachers        Score
[illegible]         13           3.92
[illegible]        184           3.92
[illegible]         41           3.84
If courses in college do not correlate with scores on this test,
perhaps experience does. Table VII gives data showing the means of
three groups based upon the number of years teaching. The middle
group, teachers who have taught from 12 to 31 years, make the highest
score, though it is only .03 points higher than teachers who have
taught from 2 to 11 years, and only .03 points higher than teachers
who have taught more than 31 years. There are only 9 teachers in the
32-42 group as few teachers in public schools have taught more than
31 years.

TABLE VII

MEAN SCORE OF TEACHERS IN THE UPPER QUARTER, MIDDLE HALF, AND LOWER
QUARTER OF YEARS TEACHING

Number of        Number of       Mean
Years            Teachers        Score
32-42                9           3.80
12-31           [illegible]      3.87
2-11                91           3.84


The test measures teachers' knowledge of young children, therefore
teaching young children should indicate more knowledge or higher
scores on the test. Table VIII shows that teachers who have taught
from 12 to 31 years make .06 points more than teachers who have
taught from 2 to 11 years, and .04 points more than those who have
taught from 32 to 42 years. These differences are not large.

TABLE VIII

MEAN SCORE OF TEACHERS IN THE UPPER QUARTER, MIDDLE HALF, AND LOWER
QUARTER OF YEARS IN TEACHING YOUNG CHILDREN

Number of        Number of       Mean
Years            Teachers        Score
32-42                9           3.87
12-31              112           3.91
2-11               127           3.85

Do teachers who major in education make higher scores on this test
than teachers who major in other departments? They make slightly
higher scores as is shown in Table IX. The difference is .04 in favor
of education majors.

TABLE IX

MEAN SCORE OF EDUCATION MAJORS AND MEAN SCORE OF ALL OTHER MAJORS

                     Number of       Mean
Record               Teachers        Score
Education Majors       151           3.83
All Other Majors        91           3.79
In general no factors were found which seriously affect scores on
the test. Differences though small do exist in a number of
factors. However, they might have consid- 
erable effect in teaching. It is difficult to in- 
terpret the real meaning of these differences. 
Statistical differences may not answer the 
questions involved. Sampling may have 
played an important role. These teachers 
were undoubtedly a select group. They teach 
in the best schools in this section of the coun- 
try and were selected by their administra- 
tive officers to assist in this study. This 
makes the range of scores limited and differ- 
ences not statistically significant. Intelli- 
gence per se was not measured. It may be 
that the determining factor, correlated with 
test scores, is intelligence. If grades from all 
colleges could have been secured and reduced 
to comparable units, they might have revealed 
significant differences. Further investigations 
may yield significant relationships. 

In order to determine how much teachers 
knew about each item of the test, all teachers 
scoring 4 or more were credited with consid- 
erable knowledge and all teachers scoring 3.5 
or below were credited with little knowledge. 
Of the five possible responses to each item 
only the first two choices really represent 
worthwhile or desirable solutions. These 
credit values include 5, 4.5, and 4. Chance 
responses would probably yield an average 
rating of 3. 

On close examination of the items best 
known or missed, the investigator found no 
specific grouping of a particular area of sub- 
ject matter known best or least known by the 
teachers who assisted in the study. This is 
not unlike knowledge in other fields of sub- 
ject matter. 

The practical use of a test or scale requires
norms. A sampling of 252 cases in better 
schools furnishes tentative norms. Two 
types of norms are given in Table X. One of 
the oldest ideas of expressing efficiency is the 
percentage scale. The maximum score on the 
test is 634.5. If this is divided into each 
score and multiplied by 100, it will give a
percentage score. A second and more recent 
norm is the percentile score. A percentile 
score is a statement of a score in terms of a 
relative or percentile position in the distribu- 
tion of the whole group. Any percentile for 
a score shows the percentage of cases making 
this score or less. Zero is the lowest per- 
centile and 100 is the highest. It is widely 
used and is one of the most useful norms for
tests in high school, college, and vocational 
groups. 


IV. CONCLUSIONS 


Before teachers’ knowledge of the conduct 
and personality of children could be ascer- 
tained, a test had to be constructed which 
would measure objectively such knowledge.
The objective type chosen is the multiple 
choice form, with five possible choices for 
each item of the test. A key of answers to 
the 134 items of the test was derived from 
the pooled opinion of nine experts in the 
fields of education, psychology, and child 
development. 

After examining the distribution of scores of the 252 teachers from
nine of the better schools of three states of the Southwest, the test
was found to be diagnostic because of its high discrimination, high
curricular validity, and high reliability.

TABLE X

PERCENTILE SCORE AND PERCENTAGE SCORE⁸ FOR THE TOTAL SCORES ON THE TEST

Total                        Percentile     Percentage
Score        Frequency       Score          Score
580-584          1             99.6           92.0
575-579          4             99.2           91.3
570-574          1             97.6           90.5
565-569          7             97.2           89.7
560-564          8             94.4           88.9
555-559          9             91.3           88.1
550-554         11             87.7           87.3
545-549         13             83.7           86.5
540-544         15             78.6           85.7
535-539         13             72.6           84.9
530-534         15             67.5           84.2
525-529         23             61.5           83.4
520-524         12             52.4           82.6
515-519         16             47.6           81.8
510-514         14             41.3           81.0
505-509         16             35.7           80.2
500-504         11             29.4           79.4
495-499         11             25.0           78.6
490-494         18             20.6           77.9
485-489          3             13.5           77.0
480-484         10             12.3           76.3
475-479          3              8.3           75.5
470-474          2              7.1           74.7
465-469          5              6.3           74.0
460-464          5              4.4           73.1
455-459          0              3.6           72.3
450-454          1              2.4           71.6
445-449          2              2.0           70.7
440-444          0              1.6           70.0
435-439          0              1.6           69.2
430-434          0              1.6           68.4
425-429          3              1.3           67.6
Total          252

⁸ Percentage score is the maximum score from the judges' pooled
opinion divided into each total score. As total scores are grouped,
the upper limit of each step interval is taken as the numerator of
the percentage fraction.
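
The percentage scores of Table X can be reproduced with the sketch below. It assumes the divisor 634.5, the maximum score quoted in the discussion of norms, and follows the table footnote in using the upper limit of each step interval as numerator. The names are hypothetical.

```python
MAX_SCORE = 634.5  # maximum score quoted in the text for the percentage scale


def percentage_score(interval_upper_limit, max_score=MAX_SCORE):
    """Percentage score for a step interval, from its upper limit."""
    return round(100 * interval_upper_limit / max_score, 1)


for upper in (584, 549, 524, 429):
    print(upper, percentage_score(upper))
# 584 92.0
# 549 86.5
# 524 82.6
# 429 67.6
```

These four values agree with the corresponding rows of Table X, which suggests that the table was computed on 634.5 rather than on the other maximum figures quoted elsewhere in the article.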

Some pertinent questions were raised from facts that were revealed by
the test. The number of years in college is significant. The type of
courses taken and the number of years teaching show only a slight
positive influence on the scores made on the test.

Colleges are, perhaps, not making student teachers cognizant of the
child as the dynamic subject matter in education and child
psychology. Colleges are, perhaps, not teaching potential teachers to
recognize and diagnose the conduct and personality of children.
Colleges are, perhaps, failing to inculcate teaching knowledge and
techniques in terms of the whole child.

The investigator found no specific grouping of a particular area of
subject matter known best or least by the teachers who assisted in
this study. The usual wide range of individual differences found in
other tests was disclosed in this test. This is not unlike knowledge
in other fields of subject matter. Since no data were found which
account for these variations, the investigator is led to believe
that a high degree of intelligence is a prime 
factor in transferring college training and 
teaching experiences into integrated knowl- 
edge of the conduct and personality of chil- 
dren from six to eight years of age. The 
kinds of information measured by this test 
certainly measure integrated knowledge. 

The test is inclusive in what it purports to 
measure; therefore it might be used advan- 
tageously in teacher training institutions. The 
practical use of a test or scale requires norms. 
Tentative norms in the forms of percentiles 
and percentages are present. They may be 
used by professors of child psychology and 
child development and by school executives 
in their selection of teachers of young children.





THE NATURE OF THE ABILITIES REQUIRED IN THE SURVEY 
COURSES OF THE CHICAGO CITY JUNIOR COLLEGES* 


Max D. ENGELHART 


Department of Examinations 
Chicago City Junior Colleges 


Introductory Statement 

A chief objective of the Chicago Junior
Colleges is to provide the means whereby all 
students can acquire a general education; 
hence, the curriculum of the colleges includes 
survey courses in the following fields: Eng- 
lish composition, the humanities, biological 
science, physical science, and social science. 
The typical freshman enrolls in English com- 
position, the first year of social science, bio- 
logical or physical science, and a number of 
elective courses. The typical sophomore enrolls in the second year
of social science,¹ the humanities, biological or physical science,
and a number of elective courses. These survey courses are taught
largely by means of lectures. One class hour per week in each is
devoted to discussion, recitation, and testing. 
Achievement is measured at the close of the 
year by means of three-hour comprehensive 
examinations prepared by the Department of 
Examinations with the cooperation of the 
faculties concerned. The mark for the year 
of study is wholly dependent on the degree of 
attainment of the student on the compre- 
hensive examination. 


The Problem of This Investigation 

It may be assumed that recognition of the attainment of general
education as a chief objective of the city junior colleges specifies
in itself a restriction on the nature of the abilities required for
successful achievement in the survey courses. Acceptance of this
objective implies that the curriculum materials, methods of
instruction, and methods of measurement shall not necessitate the
utilization of special talents by the students. In other words,
achievement in each of the surveys should be largely dependent on
capacities which all students are likely to have, or abilities which
all students are likely to acquire in varying amount. The student of
average general competence and, hence, capable of securing
satisfactory marks in some of the survey courses, should not fail to
secure a satisfactory mark in any one of the survey courses because
of a need of unique abilities which he does not possess.

* Mr. Hugh B. Lewis, a graduate student of the University of Chicago,
assisted the writer in the statistical treatment of the data.

¹ A change in p[...] has reduced the requirement in social science
from two years to one year. The data used in this study were
collected while the two-year requirement was [...].

It is the purpose of this study to determine the extent to which the
abilities required in the survey courses are general and the extent
to which they are unique to each. It is not the problem of the
investigation to identify or to name the abilities so classified.
Some suggestions may be made, however, regarding possible names of
these traits, with the qualification that these efforts at
identification are not based on the quantitative data reported.

It is sufficiently important to establish the truth of the hypothesis
that the individual differences in achievement are largely due to
abilities common to all of the survey courses. Such a conclusion has
significant application in efforts to solve the practical problems of
curriculum construction, methods of instruction, and personnel work
in the colleges.


The Sources of Data 

The original data used in this study are 
scores on the six comprehensive examinations 
and on an intelligence test, the Psychological 
Examination of the American Council on Education.
Two samples of data were used.
The first sample consisted of the scores on 
each of the six comprehensive examinations of 
100 students who had taken all six of the ex- 
aminations. The second sample consisted of 
the scores of another group of 100 students on
each of the six comprehensive examinations 
and on the intelligence test. All three col- 
leges were represented in each of the samples. 
Both samples were chosen in a random fash- 
ion. The only departure from random selec- 
tion was the rejection of cases where the data 
were incomplete.

Techniques Used

The techniques used in analyzing the data will be described in later paragraphs. It may be mentioned here that the original data were used in the calculation of Pearson product-moment coefficients of correlation between all possible series of paired scores for both of the samples referred to above. Fifteen coefficients of correlation were calculated from the first sample of original data. Twenty-one coefficients of correlation were calculated from the second sample of original data. Both sets were subjected to two methods of analysis in efforts to determine the extent to which the abilities underlying the correlations are general and the extent to which they are unique to the survey courses concerned. It was felt that the use of two independent samples and two methods of analysis would more adequately support the conclusions derived from the study.
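The counts of fifteen and twenty-one coefficients follow from pairing every variable with every other: six examinations give 6 x 5 / 2 = 15 pairs, and adding the intelligence test gives 7 x 6 / 2 = 21. The following is a minimal sketch of that pairing and of the product-moment calculation. The short labels and the toy scores are assumptions made for illustration; they are not taken from the study.

```python
from itertools import combinations
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation of two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Illustrative shorthand for the six comprehensive examinations.
exams = ["English", "SocSci101", "SocSci201", "Humanities", "BioSci", "PhysSci"]

pairs_first = list(combinations(exams, 2))                      # first sample
pairs_second = list(combinations(exams + ["Intelligence"], 2))  # second sample

print(len(pairs_first), len(pairs_second))  # 15 21

# Toy scores (not from the study): a perfectly linear pair correlates 1.0.
r_demo = pearson_r([1, 2, 3], [2, 4, 6])
```

Each pair in `pairs_first` or `pairs_second` would receive one call to `pearson_r` on the corresponding two columns of scores.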


Limitations of Data and Techniques

It is probable that the comprehensive examinations do not measure all of the desirable changes made in students as a result of instruction in the survey courses. The attainment of a general education should mean that the students have acquired not only knowledge and the ability to apply knowledge in the survey fields, but also attitudes, ideals, and interests. The comprehensive examinations measure effectively the extent to which knowledge has been acquired. They measure somewhat less effectively and somewhat problematically the extent to which the students are able to use knowledge in their thinking in the fields concerned. They do not measure directly the extent to which attitudes, ideals, and interests have been engendered. There is, of course, the possibility of indirect measurement because of the correlation of abilities. If the measurement is restricted to knowledge, and the students who have knowledge in greater amount also have the less tangible traits in greater amount than the students of lesser knowledge, then the tests measure the intangible traits indirectly. It may be that the scores on the tests are in part a result of the possession of these traits in varying degree. It may be that the general ability, or abilities, later measured but not identified from the data are inclusive of these indirectly measured traits. It is possible that the unique abilities are in part these less tangible, but decidedly desirable, traits. This limitation does not, however, significantly reduce the dependability of the conclusions derived from the quantitative data, although it does restrict the dependability of the more speculative inferences concerning the nature of the abilities.


The samples of 100 students each are somewhat limited in size. The use of two samples, however, represents an effort to meet a possible criticism of inadequacy of data from the standpoint of sampling. The examinations are not perfectly reliable instruments, but reliability determinations, not reported here, have indicated comparatively high reliability. The coefficients of reliability usually obtained for the comprehensive examinations are above .90 and, in some cases, as high as .97. In a given series of scores on the comprehensive examinations, the scores do not all refer to the same edition of the examination. However, the fact that the derived scores were used in calculation meets this limitation in large measure. Departures of scores from perfect comparability would probably tend to reduce the sizes of the coefficients of correlation obtained. The indices of the general abilities would be reduced in size on this account. These indices are so large, however, that this data fault cannot be regarded as of much significance.


The necessity for using data complete with 
respect to six examinations for each student 
means that the samples are not altogether 
representative of the college populations. Se- 
lection has operated, causing the groups to be 
somewhat more competent than the student 
body as a whole. However, the operation of 
such a factor of selection should have tended 
to reduce the correlations, since coefficients 
obtained from restricted ranges of talent are 
usually smaller than those obtained from 
wider ranges. The fact that the correlations 
between the scores on the comprehensive ex- 
aminations are high in spite of this limitation 
indicates that it is possibly not a serious one. 
It may be that possession of the general abil- 
ity is itself a factor in persistence in attend- 
ance for two years of junior-college instruc- 
tion, or conversely, that the general ability is, 
in part, the trait “perseverance.” 
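The statement that coefficients obtained from restricted ranges of talent are usually smaller can be illustrated by a small simulation. This is only a sketch under stated assumptions: two hypothetical examination scores share a simulated general ability with loadings of 0.8, and "selection" is modeled crudely by keeping only cases above the mean on the first score. None of these figures comes from the study.

```python
import random
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation of two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
full_x, full_y = [], []
for _ in range(5000):
    g = rng.gauss(0, 1)  # hypothetical general ability
    full_x.append(0.8 * g + 0.6 * rng.gauss(0, 1))
    full_y.append(0.8 * g + 0.6 * rng.gauss(0, 1))

# Restrict the range: keep only cases above the mean on the first score.
kept = [(a, b) for a, b in zip(full_x, full_y) if a > 0]
sel_x = [a for a, _ in kept]
sel_y = [b for _, b in kept]

r_full = pearson_r(full_x, full_y)
r_selected = pearson_r(sel_x, sel_y)
print(round(r_full, 2), round(r_selected, 2))
```

In this setup the correlation in the selected group comes out lower than in the full group, which is the direction of effect the paragraph above relies on.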


The techniques used in analysis of the data were devised in efforts to identify and measure human abilities. They are not appropriate as means of identification of abilities when the traits measured are very complex in nature. It is obvious that the comprehensive examinations and the intelligence test are complex in that they measure varieties of traits. However, as has been indicated in the definition of the problem of this study, the purpose is not to identify the general and unique abilities from the data of this study. The problem is to determine the extent to which abilities are common to the survey courses, and the extent to which they are unique. Any inferences respecting the nature of the abilities are admittedly speculative. The inferences are made only because they may serve to stimulate further investigations in this field, and investigations in which the variables analyzed are less complex in character. This is in a sense a pioneer study, an exploratory investigation.


Analytical Techniques

The principle that, when two phenomena vary concomitantly, one of the phenomena is the cause of the other, or they are both due to a common cause, was expressed by John Stuart Mill several years before the idea of correlation and its measurement by means of a coefficient was developed by Sir Francis Galton. The calculation of coefficients of correlation was made more precise by the development of more rigorous mathematical techniques by Karl Pearson. His formula

$$r = \frac{\sum xy}{N \sigma_x \sigma_y}$$

is used in a modified but mathematically equivalent form in the calculation of the correlation coefficients analyzed in this study. Pearson and Yule may be credited with the development of partial and multiple correlation, although other statisticians have increased our knowledge of the theory and have devised more economical methods of calculation. In recent years much use has been made of these techniques in efforts to determine the relative and combined contributions of several variables considered as causes, to another variable considered to be the effect of the given variables. It has been shown that these techniques have certain serious limitations. It has been shown that the variables eliminated in partial correlation should not be indirectly related, through correlation with other variables, to the variables for which the net relationship is sought. When this condition is not met, the coefficients of partial correlation obtained are not precise measures of the net relationship between two variables when one or more other variables are held constant or are eliminated. Similarly, the coefficient of multiple correlation does not measure with precision the degree to which a given variable is affected by several related variables acting in combination. Only under unusual circumstances do the ordinary correlation techniques accomplish these theoretical purposes.

... Factorial Meth... 2, March, 1937.

Mill, J. S. A System of Logic. New York: Longmans, Green and Company, 1906, p. 263. This volume was first published in 1843.

See Walker, Helen M. Studies in the History of Statistical Method. Baltimore: Williams and Wilkins Company, 1931, p. 103-106.

Pearson, K. "Mathematical Contributions to the Theory of Evolution. III. Regression, Heredity, and Panmixia," Philosophical Transactions, A, 187:253-318, 1896.

Walker, op. cit., p. ...
More adequate techniques have been devised
by Spearman, Kelley, Holzinger, and
Thurstone. Spearman observed that when
correlation coefficients representing all of the
intercorrelations between a number of variables
are arranged in a table in order of decreasing
average magnitude by rows, there usually
occurs a decrease in magnitude of the
coefficients from left to right. He concluded 
that this condition indicates the presence of a 
general factor underlying the variables. He 
later advocated the use of the tetrad equation 
as a means of demonstrating whether or not 
the correlations between variables are to be 
ascribed to a single common factor. The fol- 
lowing are examples of tetrad equations: 


$$r_{12} r_{34} - r_{13} r_{24} = 0$$
$$r_{12} r_{34} - r_{14} r_{23} = 0$$
$$r_{13} r_{24} - r_{14} r_{23} = 0$$

When ordinary coefficients of correlation 
are substituted and the equations equal zero 
within the limits of the probable error, the 
explanation is that a single factor is common 
to the variables concerned. The following is 
a demonstration of how a single factor can 
cause a tetrad equation to vanish, or equal 
zero. Let us assume that the correlation between
variables $X_1$ and $X_2$ is entirely due to
a common factor g. Then, if the effect of g
is removed, the correlation should equal zero.


Hence the ordinary partial correlation formula may be written

$$r_{12 \cdot g} = \frac{r_{12} - r_{1g} r_{2g}}{\sqrt{1 - r_{1g}^2}\,\sqrt{1 - r_{2g}^2}} = 0,$$

where $r_{12 \cdot g}$ represents the correlation between $X_1$ and $X_2$ when the effect of g is removed. Then, from the numerator,

$$r_{12} = r_{1g} r_{2g}.$$

Similarly,

$$r_{34} = r_{3g} r_{4g}, \qquad r_{13} = r_{1g} r_{3g}, \qquad r_{24} = r_{2g} r_{4g}.$$

Substituting in the first tetrad equation,

$$r_{1g} r_{2g} r_{3g} r_{4g} - r_{1g} r_{3g} r_{2g} r_{4g} = 0.$$

Thus a single common factor is sufficient to explain the vanishing of a tetrad. Conversely, the discovery of the fact that numerous tetrad equations vanish within the limits of probable error has been offered in support of the theory of Spearman that most intellectual traits are due to a common factor plus certain specific factors. The specific factor is unique to a trait and, hence, is not involved in a correlation coefficient or in the tetrad equations. Spearman also derived a formula for determining the extent to which a given trait correlates with the hypothetical common factor.

The use of the tetrad equations revealed that, for the correlations between certain traits, the tetrads do not vanish. The explanation offered was to the effect that factors other than the single common factor are present, that is, factors not unique to each of the variables nor common to all of them, but of some degree of generality. Techniques were devised for use in identifying these group factors.

Burks, B. S. "On the Inadequacy of the Partial and Multiple Correlation Technique," Journal of Educational Psychology, 17:532-40, 625-30, November, December, 1926.

Engelhart, M. D. "The Technique of Path Coefficients," Psychometrika, 1:287-293, December, 1936.

Monroe, W. S. and Engelhart, M. D. The Scientific Study of Educational Problems. New York: The Macmillan Company, 1936, p. 366-400.

Spearman, C. "'General Intelligence' Objectively Determined and Measured," American Journal of Psychology, 15:201-293, 1904.

... theoretically cause a tetrad equation to vanish, but the existence of such combinations seems improbable.
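The tetrad argument can be checked numerically. The sketch below assumes four hypothetical tests whose correlations arise from one common factor (the loadings 0.9, 0.8, 0.7, 0.6 are invented for illustration), confirms that all three tetrad differences vanish, and then adds an extra overlap shared only by tests 1 and 2 to imitate a group factor.

```python
def tetrads(r):
    """Three tetrad differences for tests 1-4, given r[(i, j)] with i < j."""
    return (
        r[(1, 2)] * r[(3, 4)] - r[(1, 3)] * r[(2, 4)],
        r[(1, 2)] * r[(3, 4)] - r[(1, 4)] * r[(2, 3)],
        r[(1, 3)] * r[(2, 4)] - r[(1, 4)] * r[(2, 3)],
    )


# Hypothetical correlations of each test with the single common factor g.
g = {1: 0.9, 2: 0.8, 3: 0.7, 4: 0.6}

# With one common factor, every correlation is a product of two loadings.
single = {(i, j): g[i] * g[j] for i in g for j in g if i < j}
t_single = tetrads(single)

# Imitate a group factor: extra correlation shared only by tests 1 and 2.
grouped = dict(single)
grouped[(1, 2)] += 0.2
t_grouped = tetrads(grouped)

print(t_single)
print(t_grouped)
```

The tetrads that involve the correlation between tests 1 and 2 no longer vanish once the extra overlap is added, while the one that does not involve it remains zero.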

[...] and Thurstone have attacked the problem of analyzing correlations in an effort to determine the underlying factors. The research of Thurstone has led to procedures which seem most effective in solving the problems. By means of his "centroid" method, one can ascertain how many basic factors underlie the intercorrelations of a number of traits. If only a single factor is present, his procedure gives the same results as the procedure of Spearman. If, however, several common factors are present (factors of varying degrees of generality), the procedure gives the correlations of the original variables with these factors. However, the solution thus obtained is not unique. Other solutions can be obtained; hence, further mathematical techniques are used in order to secure psychologically meaningful results.

See Kelley, T. L. Crossroads in the Mind of Man. Stanford University Press, 1928.

... Cairns, George ... "... of Mathematical Abilities," Catholic University ... Educational Research Monographs, Vol. 6, ... Washington: Catholic Education Press, April, 1931.

... Harvard Studies in Education, Vol. 26. Cambridge, Massachusetts: Harvard University Press, 1935. 146 p.

Holzinger, Karl J. Preliminary Report on Spearman ... Trait Study ... Department of Education, University of Chicago, 1934, 1935, and 1936.

Hotelling, Harold. "Analysis of a Complex of Statistical Variables into Principal Components," ...

The centroid method was used in this study with the intercorrelations calculated from the comprehensive examination data. The writer did not feel that his data justified the continuation of the analysis beyond the identification of a single common factor. As will be indicated later, it is probable that minor factors common to two or more of the variables are present, particularly in the situation where intelligence is included as one of the original traits. The possible existence of these factors does not limit the conclusion that achievement in any given survey course is largely due to ability or abilities which function in all of the surveys.

In a preceding paragraph it was shown that the correlation between two traits is equal to the product of the correlations between each of the traits and a single common factor, i.e., $r_{12} = r_{1g} r_{2g}$. Where there are several common factors 1, 2, 3, 4, ..., r, any correlation $r_{jk}$ may be expressed,

$$r_{jk} = a_{j1} a_{k1} + a_{j2} a_{k2} + a_{j3} a_{k3} + \cdots + a_{jr} a_{kr} \qquad (1)$$

where $a_{j1}$ refers to the correlation between trait j and the hypothetical common factor 1, and $a_{k1}$ refers to the correlation between trait k and the factor 1. Similarly, $a_{j2}$ and $a_{k2}$ refer to the correlations with a second factor. The symbols $a_{jr}$ and $a_{kr}$ have the same meaning for the last, or rth, factor common to the tests. These factors, while correlated with the traits j and k, are themselves uncorrelated. A similar equation may be written in symbol form for each of the correlations in a table of all of the intercorrelations between the given tests. The subscript j stands for any row, and k stands for any column; since such tables are symmetrical, they refer successively to the same traits and in the same order.

Thurstone, L. L. The Theory of Multiple Factors. Ann Arbor, Michigan: Edwards Brothers, Inc., June, 1932.

Thurstone, L. L. A Simplified Multiple Factor Method and an Outline of the Computations. Ann Arbor, Michigan: Edwards Brothers, Inc., 1933, 26 p.

Thurstone, L. L. The Vectors of Mind. Chicago: University of Chicago Press, 1935. 266 p.

... single common factor was used as a check.

See Thurstone, L. L. The Vectors of Mind. Chicago: University of Chicago Press, 1935. p. 92.

The derivation given here in explanation of the centroid method closely follows that given by Thurstone.
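Equation (1) can be read as a rule for rebuilding a correlation table from factor loadings. The sketch below assumes three hypothetical tests with invented loadings on two uncorrelated common factors; the numbers are for illustration only.

```python
def reproduced_r(loadings, j, k):
    """Equation (1): sum over factors of the products of the two loadings."""
    return sum(a_j * a_k for a_j, a_k in zip(loadings[j], loadings[k]))


# Hypothetical loadings: one row per test, one column per common factor.
loadings = [
    [0.8, 0.3],
    [0.7, 0.4],
    [0.6, -0.2],
]

n = len(loadings)
table = [[reproduced_r(loadings, j, k) for k in range(n)] for j in range(n)]

# Off-diagonal cells are the reproduced correlations; each diagonal cell is
# the sum of squared loadings for that test (its communality).
for row in table:
    print([round(value, 2) for value in row])
```

The table is symmetric because the product of two loadings does not depend on which test is taken as the row and which as the column.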


Geometrically, the numerical values of the 
a’s depend on the locations of orthogonal ref- 
erence vectors, since each a can be represented 
as the projection of a vector which stands as 
a trait, or test variable, on the vector which 
represents the given common factor, which 
may be termed an “orthogonal” reference 
vector. The term “orthogonal” means that 
the reference, or factor, vectors are at right 
angles to each other, a configuration which
denotes their lack of intercorrelation. There 
are as many of these reference vectors as 
there are common factors. If there are two 
such factors, the geometric representation is a 
plane; if there are three factors, the repre- 
sentation is in three dimensions; and if there 
are more than three factors, there are more 
than three dimensions, and the geometry is 
that of hyperspace. 

The centroid method yields for each trait, or test variable, a series of factor loadings. These loadings are the a's, that is, the correlations of each test with the first factor, with the second factor, and so on. Since, however, the values of the a's depend upon the locations in space of the reference vectors which represent the common factors, and because the centroid method does not yield a unique solution in problems where more than one factor is involved, it is usually necessary to "rotate" the reference vectors obtained from a centroid solution. This is done to secure results which have psychological meaning. Rotation of the reference vectors, or coordinate axes of the system, does not affect the values which may be calculated for $r_{jk}$. In other words, the centroid factor loadings, or the loadings secured with reference to new coordinate axes, may be substituted in the equation given for $r_{jk}$ and the original table of correlations reproduced, an excellent check on the method.
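The check described here, that rotated loadings reproduce the same correlations, can be sketched for the two-factor case. The loadings and the 30-degree rotation below are assumptions chosen for illustration, not values from the study.

```python
from math import cos, sin, radians


def reproduce(loadings):
    """Rebuild the full table of r_jk from loadings on uncorrelated factors."""
    n = len(loadings)
    return [
        [sum(a * b for a, b in zip(loadings[j], loadings[k])) for k in range(n)]
        for j in range(n)
    ]


def rotate(loadings, degrees):
    """Rotate two orthogonal reference axes through the given angle."""
    c, s = cos(radians(degrees)), sin(radians(degrees))
    return [[a1 * c + a2 * s, -a1 * s + a2 * c] for a1, a2 in loadings]


# Hypothetical loadings of three tests on two factors.
centroid_like = [[0.8, 0.3], [0.7, 0.4], [0.6, -0.2]]
rotated = rotate(centroid_like, 30)

before = reproduce(centroid_like)
after = reproduce(rotated)
max_diff = max(
    abs(before[j][k] - after[j][k]) for j in range(3) for k in range(3)
)
print(max_diff)
```

The individual loadings change under rotation, but the reproduced table agrees to within floating-point rounding.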


Let

$$r_{jk} = a'_{j1} a'_{k1} + a'_{j2} a'_{k2} + a'_{j3} a'_{k3} + \cdots + a'_{jr} a'_{kr} \qquad (2)$$

where the primes refer to loadings obtained from the centroid solution.

The vectors may be visualized as originating in a point, the origin. The reference vector representing the first common factor passes through the origin also and through the centroid or, in terms of physics, the center of gravity of the points which terminate the test vectors. When this is the case, the projections on the reference vector are a maximum and the reference vector has zero projections on the rest of the orthogonal reference vectors representing other factors. The equation just given may be summed for all tests j in column k of the correlation table:

$$\sum_{j=1}^{n} r_{jk} = a'_{k1} \sum_{j=1}^{n} a'_{j1} + a'_{k2} \sum_{j=1}^{n} a'_{j2} + \cdots + a'_{kr} \sum_{j=1}^{n} a'_{jr} \qquad (3)$$

and summing for all columns of the correlation table:

$$\sum_{k=1}^{n} \sum_{j=1}^{n} r_{jk} = \sum_{k=1}^{n} a'_{k1} \sum_{j=1}^{n} a'_{j1} + \sum_{k=1}^{n} a'_{k2} \sum_{j=1}^{n} a'_{j2} + \cdots + \sum_{k=1}^{n} a'_{kr} \sum_{j=1}^{n} a'_{jr} \qquad (4)$$

Since the correlation table is symmetric, each row has its corresponding column:

$$\sum_{k=1}^{n} a'_{k1} = \sum_{j=1}^{n} a'_{j1} \qquad (5)$$

$$\sum_{k=1}^{n} \sum_{j=1}^{n} r_{jk} = \Big[ \sum_{j=1}^{n} a'_{j1} \Big]^2 + \Big[ \sum_{j=1}^{n} a'_{j2} \Big]^2 + \cdots + \Big[ \sum_{j=1}^{n} a'_{jr} \Big]^2 \qquad (6)$$

The co-ordinate of the centroid of the termini of the test vectors, measured along the reference vector representing common factor 1, is equal to

$$\frac{1}{n} \sum_{j=1}^{n} a'_{j1}$$

where n is the number of tests.

One can assume that the system has been so rotated that the centroid lies in the first axis of reference, the vector representing the first common factor. This centroid has zero projections on the other orthogonal reference vectors or axes. Hence

$$\frac{1}{n} \sum_{j=1}^{n} a'_{j2} = \frac{1}{n} \sum_{j=1}^{n} a'_{j3} = \cdots = \frac{1}{n} \sum_{j=1}^{n} a'_{jr} = 0 \qquad (7)$$

Substituting in equation (6),

$$\sum_{k=1}^{n} \sum_{j=1}^{n} r_{jk} = \Big[ \sum_{j=1}^{n} a'_{j1} \Big]^2 = r_t \qquad (8)$$

in which $r_t$ equals the summation of all the coefficients of correlation in the table including the diagonal terms.

Substituting (7) in (3), the summation of the coefficients in a given column is

$$\sum_{j=1}^{n} r_{jk} = a'_{k1} \sum_{j=1}^{n} a'_{j1} \qquad (9)$$

From (8),

$$\sum_{j=1}^{n} a'_{j1} = \sqrt{r_t} \qquad (10)$$

Hence, substituting in (9),

$$\sum_{j=1}^{n} r_{jk} = r_k = a'_{k1} \sqrt{r_t} \qquad (11)$$

in which $r_k$ represents the sum of the coefficients in the given column. Hence

$$a'_{k1} = \frac{r_k}{\sqrt{r_t}} \qquad (12)$$

Thus one may determine for each test, represented by a column in the correlation table, the correlation of that test with the first common factor, since $r_t$ can be obtained by summing the entire table and $r_k$ can be obtained by summing the given column.

One can then remove from each of the original coefficients of correlation the part of the correlation which is due to the first common factor and thus obtain the residual correlation, i.e., that part of the correlation which is due to other common factors. From equation (2),

$$r_{jk \cdot 1} = r_{jk} - a'_{j1} a'_{k1} \qquad (13)$$

where $r_{jk \cdot 1}$ represents a given residual correlation corresponding to the original correlation $r_{jk}$, $a'_{j1}$ represents the correlation of trait j with common factor 1, and $a'_{k1}$ represents the correlation of trait k with common factor 1. This has been done in this study. From the residual correlations, by a process identical with that expressed by equation (12), the correlation of each test with a second factor can be obtained. However, since the centroid previously mentioned has zero projections on the second, third, and other reference vectors, it is at the origin in the (r - 1) subspace and must be removed from this origin. This is accomplished by a process described by Thurstone, and need not be explained here. The process of extracting factors is repeated until the residual correlations are negligible in size.

... In this study the ... correlation ... the estimated communality ... and inserted ...

The term communality ... equal to the ... of the factor loadings of the common factors ... fact that the test has ... in common with ...

... factor has been removed. It was not at the origin in the ... space.

The First Sample

In Table I are given the coefficients of correlation calculated from the first sample of data.

Through use of the process represented by equation (12), the following correlations of each of the tests with the first common factor were calculated:

English 101-102                 (illegible)
Social Science 101-102          (illegible)
Social Science 201-202          .50 (doubtful reading)
Humanities 201-202              .82
Biological Science 101-102      .84
Physical Science 101-102        .77

TABLE I

                              English   Social    Social    Humanities  Biological
                                        Science   Science               Science
                              101-102   101-102   201-202   201-202     101-102
Social Science 101-102          .62
Social Science 201-202          .63       .75
Humanities 201-202              .61       .62       .76
Biological Science 101-102      .66       .68       .66       .69
Physical Science 101-102        .56       .67       .58       .54         .69

The residual correlations were then calculated by means of equation (13). These are given in Table II.

None of the residuals is significant, a fact which indicates that the structure of each of the variables is that of a factor common to all of the variables, plus a factor specific to that variable. Calculation of the correlations of each of the tests with the common factor by means of the Spearman method, which gives identical results only when one common factor is present, gave the following values:

English 101-102
Social Science 101-102
Social Science 201-202
Humanities 201-202
Biological Science 101-102
Physical Science 101-102
34 
.85 
.80 
85 
.78 
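The two calculations applied to the first sample can be sketched from the Table I coefficients. Two assumptions are made here and should be read as such. First, each diagonal cell is filled with the highest coefficient in its column as the estimated communality; the source's footnote on this point is only partly legible. Second, the "Spearman method" is taken to be the familiar single-factor formula in which the squared loading equals (A^2 - A') / (T - 2A), with A the sum of a test's correlations with the others, A' the sum of their squares, and T the sum of all off-diagonal cells of the table.

```python
from math import sqrt

names = [
    "English 101-102", "Social Science 101-102", "Social Science 201-202",
    "Humanities 201-202", "Biological Science 101-102",
    "Physical Science 101-102",
]

# Lower triangle of Table I, one row per test in the order above.
lower = [
    [],
    [0.62],
    [0.63, 0.75],
    [0.61, 0.62, 0.76],
    [0.66, 0.68, 0.66, 0.69],
    [0.56, 0.67, 0.58, 0.54, 0.69],
]

n = len(names)
R = [[0.0] * n for _ in range(n)]
for j in range(n):
    for k in range(j):
        R[j][k] = R[k][j] = lower[j][k]

# ASSUMPTION: estimated communality = highest coefficient in the column.
for k in range(n):
    R[k][k] = max(R[j][k] for j in range(n) if j != k)

# Centroid method, equation (12): column sum over root of the grand total.
col_sums = [sum(R[j][k] for j in range(n)) for k in range(n)]
r_t = sum(col_sums)
centroid = [s / sqrt(r_t) for s in col_sums]

# Residuals, equation (13), for the off-diagonal cells.
max_resid = max(
    abs(R[j][k] - centroid[j] * centroid[k])
    for j in range(n) for k in range(n) if j != k
)

# ASSUMED form of Spearman's single-factor formula (off-diagonal cells only).
T = sum(R[j][k] for j in range(n) for k in range(n) if j != k)
spearman = []
for a in range(n):
    A = sum(R[a][i] for i in range(n) if i != a)
    A_sq = sum(R[a][i] ** 2 for i in range(n) if i != a)
    spearman.append(sqrt((A * A - A_sq) / (T - 2 * A)))

for name, c, s in zip(names, centroid, spearman):
    print(f"{name:28s} {c:.2f} {s:.2f}")
```

Under these assumptions the centroid loadings for Humanities, Biological Science, and Physical Science round to .82, .84, and .77, and the Spearman-formula values for the two Social Science examinations, Humanities, and Biological Science round to .84, .85, .80, and .85, which is consistent with the figures that remain legible in the source. The largest off-diagonal residual is below .10.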
The Second Sample
In Table III are given the intercorrelations 
of the six comprehensive examinations with 


p. xvi, equation 


between test @ and 5 
f the squares of these 
ntercorrelations 


OF EXPERIMENTAL EDUCATION 


each other and with intelligence as measured
by the Psychological Examination of the
American Council on Education.
The correlations of each test with the common
factor, calculated by means of equation
(12), are as follows:
English 101-102 
Social Science 101-102 
Social Science 201-202 
Humanities 201-202 
Biological Science 101-102 ___- 
Physical Science 101-102 
Intelligence

Calculation by the other method referred to
gives the following values:

English 101-102 

Social Science 101-102 

Social Science 201-202 

Humanities 201-202 

Biological Science 101-102 

Physical Science 101-102 

Intelligence 


In Table IV are given the residual correlations calculated through use of the first set of values.

There are somewhat greater discrepancies between the two series of values and the residuals are slightly larger. Hence, when intelligence is included as a variable, a single common factor is not sufficient to explain entirely the original correlations. Other, though minor, common factors exist.

TABLE II
Residual correlations, first sample. (Cell values are not recoverable from this copy.)

TABLE III
Intercorrelations of the six comprehensive examinations and intelligence, second sample. (Cell values are not recoverable from this copy.)

TABLE IV
Residual correlations, second sample. (Cell values are not recoverable from this copy.)

The correlations between the tests and the common factor when intelligence is not included as a variable are as follows: (The first column refers to the centroid method of calculation, and the second column to Spearman's method.) (The values are not recoverable from this copy.)

Apparently, one factor is sufficient to explain the intercorrelations between the scores on the comprehensive examinations.

Conclusions

The common factor may represent a complex of abilities. It has something in common with what is measured by an intelligence test, but it is not restricted to intelligence as measured. This was shown by the fact that the typical comprehensive examination correlates .80 with the common factor, while intelligence correlates .44. It is possible that the common factor includes such traits as "perseverance" or other attributes of character which may compensate for a lack of intelligence. One can interpret the correlation of a comprehensive examination with the common factor in terms of individual differences. It may be said that approximately 65 per cent of the variance (or less precisely, variation in achievement) is due to the common factor. Assuming a coefficient of reliability of .90 for the typical comprehensive examination, 25 per cent of the variance, or variation in achievement, may be ascribed to minor general factors, or to abilities specific to the given survey.

The results of this study support the contention that the survey courses qualify as survey courses in that the abilities required are largely common. While special talent may be required for superlative achievement in a given survey course, average, or even superior, achievement is possible for the student of average or superior status with respect to the general ability. Conversely, satisfactory achievement in a given survey course is possible for any student whose achievement in the other surveys is satisfactory. If a student's achievement in one survey course is unsatisfactory while his achievement in the others is satisfactory, the failure is probably the result of lack of interest because of inadequate motivation rather than the result of lack of some special ability.

... interpretation of coefficients of correlation, see: Engelhart, ... Psychometrika, ...
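The percentages quoted in the conclusions follow from squaring the correlation with the common factor and comparing the result with the reliability coefficient. A minimal sketch of that arithmetic, using the figures stated in the text (.80, .44, and an assumed reliability of .90), is given below; the function name is illustrative.

```python
def variance_shares(loading, reliability):
    """Split a test's variance into common, other reliable, and error parts."""
    common = loading ** 2            # share due to the common factor
    error = 1.0 - reliability        # share due to unreliability
    other = reliability - common     # minor general or specific abilities
    return common, other, error


exam_common, exam_other, exam_error = variance_shares(0.80, 0.90)
intelligence_common = 0.44 ** 2

print(round(exam_common, 2), round(exam_other, 2), round(exam_error, 2))
print(round(intelligence_common, 2))
```

The exact squares are .64 and .26, which the text rounds to "approximately 65 per cent" and "25 per cent"; the remaining .10 is the share attributed to unreliability of measurement. For intelligence, a correlation of .44 corresponds to about 19 per cent of variance in common with the factor.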





SOCIAL COMPETENCE OF GRADE SCHOOL CHILDREN

KATHERINE P. BRADWAY, M. A.*
Research Assistant, The Training School at Vineland, New Jersey

Tests of verbal intelligence and scholastic achievement have been used as criteria for the classification of grade school children when the materials of instruction have been essentially academic. However, since modern education emphasizes the preparation of the child for daily living, and organizes subject matter in terms of interests, activities, and social needs, it is becoming desirable to include in the classification criteria measures of social attainment. The purpose of the present paper is to call attention to a recently developed scale for measuring social maturity, and to report the results of administering the scale to a group of three hundred children attending grade school.

The Vineland Social Maturity Scale measures social development in terms of personal independence and responsibility. In infancy and early childhood social maturity is reflected in self-help, at adolescence in self-direction, and in adult life as assumption of responsibility for others. The successive items of this social scale represent progressive maturation in self-help, self-direction, social relations, locomotion, occupation, and communication. The items are divided into age groups representing increasing degrees of social competence. These genetic levels of performance are considered as successive stages of social maturation.

* The author acknowledges the assistance of Dr. Edgar A. Doll in the treatment of data and ...

VINELAND SOCIAL MATURITY SCALE

Items

Years 0-I
"Crows"; laughs
Balances head
Grasps objects within reach
Reaches for familiar persons
Rolls over
Reaches for nearby objects
Occupies self unattended
Sits unsupported
Pulls self upright
"Talks"; imitates sounds
Drinks from cup or glass assisted
Moves about on floor
Grasps with thumb and finger
Demands personal attention
Stands alone
Does not drool
Follows simple instructions
Years I-II
Walks about room unattended
Marks with pencil or crayon
Masticates food
Pulls off socks
Transfers objects
Overcomes simple obstacles
Fetches or carries familiar objects
Drinks from cup or glass unassisted
Gives up baby carriage
Plays with other children
Eats with spoon
Goes about house or yard
Discriminates edible substances
Uses names of familiar objects
Walks upstairs unassisted
Unwraps candy
Talks in short sentences

Years II-III
Asks to go to toilet
Initiates own play activities
Removes coat or dress
Eats with fork
Gets drink unassisted
Dries own hands
Avoids simple hazards
Puts on coat or dress unassisted
Cuts with scissors
Relates experiences

Years III-IV
Walks downstairs one step per tread
Plays cooperatively at kindergarten
Buttons coat or dress
Helps at little household tasks
"Performs" for others
Washes hands unaided

Years IV-V
Cares for self at toilet
Washes face unassisted
Goes about neighborhood unattended
Dresses self except tying
Uses pencil or crayon for drawing
Plays competitive exercise games

Years V-VI
Uses skates, sled, wagon
Prints simple words
Plays simple table games
Is trusted with money
Goes to school unattended

Years VI-VII
Uses table knife for spreading
Uses pencil for writing
Bathes self assisted
Goes to bed unassisted






SOCIAL COMPETENCE 


Years 
VII-VIII 
time to quarter hour 
: table knife for cutting 
Disavows literal Santa Claus 
pates in pre-adolescent 
or brushes hair 
VIII-IX 
utensils 
itine household 
Is on own initiative 
es self unaided 
IX-X 


avdi 


ols o1 


1 
tasks 


occasional sho 
telephone calls 
small remunerative work 
purchases by mail 
XI-XII 
simple creative work 
eft to care for self or others 
njeys books, newspapers, magazines 
XII-XV 
Plays difficult games 
xercises complete care of dress 
wn clothing accessories 
adolescent group activities 
ms responsible routine chores 
XV-XVIII 
mmunicates by letter 
Ws current events 
es to nearby places alone 
es out unsupervised daytime 
own spending money 


ys all own clothing 


XVIII-XX 
: to distant points alone 
Looks after own health 
Has a job or continues schooling 
Goes out nights unrestricted 
Controls own major expenditures 
Assumes personal responsibility 

XX-XXV 
Uses money providently 
Assumes responsibilities beyond own needs 
Contributes to social welfare 
Provides for future 

XXV+

Performs skilled work 
Engages in beneficial recreation 
Systematizes own work 
Inspires confidence 
Promotes civic progress 
Supervises occupational pursuits 
Purchases for others 
Directs or manages affairs of others 
Performs expert or professional work 
Shares community responsibility 
Creates own opportunities 
Advances general welfare 


S ads, 


ages in 


Although patterned in principle after the Binet Scale for measuring
intelligence, this social scale does not require direct examination of
the subject. The method provides instead for a record of habitual
performances obtained by interviewing someone intimately familiar with
the person examined. The final score is reckoned from the total number
of items successfully performed, and this score may be converted to a
social age score by referring to the scale directly without reference
to a table of norms. A social quotient is obtained by dividing the
social age by the life age up to life age 25. For adults older than 25
years, the divisor remains 25, just as the divisor for Binet
intelligence quotients remains 14 (or 16) after the age of 14 (or 16).
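The computation of the social quotient described above may be sketched as follows. This is a minimal illustration in Python, not part of the original report: the function name is hypothetical, and the multiplication by 100 is an assumption, made because the quotients reported later (88, 108, and the like) are evidently expressed on a base of 100.

```python
def social_quotient(social_age: float, life_age: float,
                    divisor_cap: float = 25.0) -> float:
    """Social age divided by life age, the divisor being held at 25 years.

    The factor of 100 is an assumption (quotients in the text are on a
    base of 100); the cap of 25 years is stated in the text.
    """
    if social_age < 0 or life_age <= 0:
        raise ValueError("ages must be positive")
    divisor = min(life_age, divisor_cap)  # divisor never exceeds 25
    return 100.0 * social_age / divisor


if __name__ == "__main__":
    # Hypothetical child: social age 9, life age 10.
    print(social_quotient(9.0, 10.0))
    # Hypothetical adult of 40 with social age 20: divisor stays at 25.
    print(social_quotient(20.0, 40.0))
```

The cap on the divisor is what keeps the quotient of a mature adult from shrinking merely because life age continues to advance.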


The validation and normal standardization of the scale are reported in
detail elsewhere (4, 7). Its reliability on re-examination is high
(r = .93), and its validity as determined by correlations between
estimates and obtained scores is also high (r = .85). The scale is
relatively free from sex differences, and performance on the scale is
not correlated with social status, except for the relation of the
latter to intelligence.

Application of the Scale 


Dr. Roy F. Street, Director of Mental Hygiene, Anne J. Kellogg School,
Battle Creek, Michigan, generously put at our disposal some early
results he obtained by administering the Vineland Social Maturity Scale
to children at the Anne J. Kellogg School. This school provides for
classifying children within grades according to the individual
abilities of the children. In addition to the regular classes there are
several special classes. The data which Dr. Street referred to us
included life ages, social maturity scores*, Binet mental ages, and
percentile rankings on the Pintner Pupil Portrait test, for 310
children distributed in the following classes: regular classes of
grades 4, 5, and 6; slow to retarded classes of grades 4, 5, and 6;
retarded classes of grades 7 and 8; gifted classes of grades 5, 6, 7,
and 8; remedial classes of grades 4, 5, and 6; and open-air classes of
grades 4, 5, and 6. The non-conformity of the groups for which we have
results limits the study somewhat, but the data are adequate to show
certain trends.

* The first published form of the Social Maturity Scale (Form A) was
used in obtaining the social maturity scores. However, the 1936 norms
were used in converting social scores to social ages. It should be
noted here, too, that the item differences between Form A and the
revised form (Form B) ... the level ... grade school children.

Study of the material revealed that its treatment would throw light on
three problems which were important in relation to the possibility of
using the scale as a criterion in educational classification. These
problems were (1) differences in social maturity of children in
retarded, regular, and gifted classes; (2) relation between social age
and mental age; (3) relation between social competence and personality
as measured by the Pintner Pupil Portrait test.


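Relation to Type of Class

Table 1 shows the means and extreme deviations of life age, social age,
social quotient, mental age, and intelligence quotient for the
retarded, regular, and gifted classes. Table 2 shows corresponding
means for the remedial and open-air classes.
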
The column labelled “SA” in Table 1 indicates that there is a marked
relation between social age and the type of class in which the child is
placed. The mean SA increases from the retarded to the gifted class for
each grade, although the corresponding LA’s decrease. It will be noted
that the differences between the mean SA’s of the retarded and gifted
classes of any one grade are all greater than 3 years, whereas the
difference between the mean SA of the retarded class of grade 4 and the
retarded class of grade 8 is only 1.5 years. The gifted show a more
significant increase from grade to grade than do the retarded. The
gifted increase from a mean SA of ... at grade 5 to a mean SA of 15.6
at grade 8.


TABLE 1 


tXTREMI 


MEANS 


SQ MA _ IQ 


SA 
88 83 82 
104 


Giftes 


Grade 6 
Slow to Ret 
Reg ilar 


Giited 


Grade 7 
Retarded 
Gifted 

Grade 8 
Re tarded yA ‘ 10.1 
Gifted ( + ! 16.8 


i 


73 
128 
mental age, and intelligence quotient for the 
regular, and gifted classes. Table 2 
responding means for the remedial 


retarded 
shows ci 


and open-air classes. 
TABLE 2 
AND OPEN-AIR CLASSES 
MA IQ 


REMEDIAL 


Remedia! 11 % 9.7 104 
Air y 9." 9.2 JO v ‘ 


Open 


Grade 5 
Reme 
Open- 


108 


93 


107 
105 


9.9 
10.9 


10.6 
11.4 


10.2 
10. 


lial 

Air 

Grade 6 
Remedial : ) 11.: 94 


Open-Air 91 


DEVIATIONS FOR RETARDED, REGULAR, 


10.5-11.7 


11.1-12.$ 


10.9-—12. 


10.3—16.3 
11.9-13 


AND GIFTED CLASSES 


DEVIATIONS 
MA 


EXTREME 
SA SQ 


73-100 
85-130 


LA 


6.8— 9.4 
8.3—12.4 


10.8 
10.7 


Jd 


8.8 


6.8—11.0 
8.5-14.0 
11.0-14.0 


60-104 6.7—10.4 
80-135 9.3—13.0 
106-149 12.3—13.8 


9.7-11.6 
9.4-10.9 


46-103 
84-132 
104—127 


5.2—13.0 
10.0—15.0 
11.0—14.0 


9.8-11. 


42— 87 
91-130 


5.8-12.0 
11.7—16.0 


8.0—12.0 
14.0—17.0 15.6-18.0 

In the regular classes there is a 1.3 year increase from grade 4 to
grade 6.

The next column, labelled “SQ,” indicates the relation between social
quotient, or relative social maturity, and type of class. The mean SQ
for the retarded classes drops from 88 to 71, for the regular classes
from 108 to 100, and for the gifted classes it varies between 113 and
120. The range of SQ’s presented in Column 9 of Table 1 shows that
there is no overlapping of the SQ’s for the retarded classes with those
for the gifted classes of the same grade.

On the basis of these data we may conclude that social maturity varies
appreciably with type of class, and to a lesser degree with grade.
This, of course, merely reflects influences operative in producing the
classification of the pupils in the existing ...


A similar analysis may be made from Table 2. Here there are less
consistent differences, possibly due to the smaller numbers of subjects
in the several groupings. The differences between the mean SQ’s of the
two types of classes are not large.


Relation Between Social Age and Mental Age

In considering the use of the Vineland Social Maturity Scale in
conjunction with intelligence tests for classification of grade school
children, it is important to know the relation between social maturity
and intelligence.

The close correspondence between the mean mental age and the mean
social age for each class in successive grades may be ... . For the
regular classes the mean MA minus the mean SA varies from ... . For the
retarded classes the mean MA is below the mean SA, and for the gifted
classes the mean MA is above the mean SA. The inferiority of mean MA to
mean SA in retarded classes has been reported in ... studies ... . The
direction of these differences is accounted for by the fact that mental
age is one of the major criteria for the segregation of children in
these classes for the retarded and the gifted.


The relation between social age and mental age for the complete group
of 310 subjects (including the remedial and open-air classes) is shown
in Table 3. The mean MA and mean SA for the total group are identical,
i.e., 11.3. The standard deviation of MA is 2 per cent higher than that
of SA. The correlation coefficient for these data was r = .73, which
corrected for LA gave the partial correlation r = .72. To avoid the
possibility that the inclusion of grades 7 and 8, in which only the
extreme classes were represented, was artificially raising the
correlation, a second correlation was computed in which the children
from these grades were eliminated from the data. The correlation was
r = .62, which corrected by partial correlation for LA remained
r = .62.
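The correction for life age mentioned above is, presumably, the usual first-order partial correlation. The following is a minimal sketch under that assumption, written in Python. The function name is hypothetical, and the correlations of SA and of MA with LA are not given in the text, so the values used for them below are purely illustrative and will not reproduce the reported figures.

```python
import math


def partial_correlation(r_xy: float, r_xz: float, r_yz: float) -> float:
    """First-order partial correlation of x and y with z held constant.

    r_xy.z = (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2) * (1 - r_yz^2))
    """
    for r in (r_xy, r_xz, r_yz):
        if not -1.0 <= r <= 1.0:
            raise ValueError("correlations must lie between -1 and 1")
    denominator = math.sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))
    if denominator == 0.0:
        raise ValueError("partial correlation undefined when |r| = 1 with z")
    return (r_xy - r_xz * r_yz) / denominator


if __name__ == "__main__":
    r_sa_ma = 0.73   # reported correlation of SA with MA
    r_sa_la = 0.30   # hypothetical: SA with LA (not given in the text)
    r_ma_la = 0.25   # hypothetical: MA with LA (not given in the text)
    print(round(partial_correlation(r_sa_ma, r_sa_la, r_ma_la), 4))
```

When both variables are only weakly related to life age, the partial coefficient differs little from the raw one, which agrees in kind with the small changes reported.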


The correlation between SA and MA for the population of an institution
for the feeble-minded (7) has been found to be r = ... . Since the
greater heterogeneity (wider spread) of the institutional group might
account for the higher correlation, these correlations must be compared
in terms of their standard deviations. The standard error of estimate
affords the basis for such a comparison. For both the total grade
school group and the grade school group with grades 7 and 8 omitted,
this value, σ√(1 − r²),* was ... . For the feeble-minded group the
relation was 1.5. These are comparable and suggest that social maturity
and Binet intelligence are related to the same degree for the
feeble-minded as they are for the intellectually “normal.”


TABLE 3

RELATION BETWEEN SOCIAL AGE AND MENTAL AGE
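The standard error of estimate used for this comparison may be sketched as follows. This is a minimal Python illustration, not the authors' computation: the function names are hypothetical, and the standard deviations shown are invented, since the text does not report them. The averaging of the two σ's follows the footnote attached to the formula.

```python
import math


def average_sigma(sigma_a: float, sigma_b: float) -> float:
    """Average of the two standard deviations, as the footnote describes."""
    return (sigma_a + sigma_b) / 2.0


def standard_error_of_estimate(sigma: float, r: float) -> float:
    """Standard error of estimate: sigma * sqrt(1 - r^2)."""
    if sigma < 0 or not -1.0 <= r <= 1.0:
        raise ValueError("sigma must be non-negative and |r| <= 1")
    return sigma * math.sqrt(1.0 - r * r)


if __name__ == "__main__":
    # Hypothetical standard deviations of SA and MA, in years.
    sigma = average_sigma(2.0, 2.2)
    # r = .73 is the correlation reported for the total group.
    print(round(standard_error_of_estimate(sigma, 0.73), 2))
```

Because the measure scales the spread by √(1 − r²), a more heterogeneous group with a higher correlation may yield the same error of estimate as a narrower group with a lower one, which is the point of the comparison.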


Relation Between Personality and Social Maturity

Another problem which is important in considering the Social Maturity
Scale as a classification instrument is the relation between
personality or adjustment and social maturity. The Pintner Pupil
Portrait test (11) had been administered to 278 of the 310 subjects
examined with the Social Maturity Scale. The Pintner test is a
self-administering test for school pupils in grades 4 to 8. The test is
composed of 100 statements in the form of impersonal descriptions of
another child, and the pupil is asked to indicate whether he acts or
feels the same or differently about that particular situation or
person. The items include such statements as: “This child thinks school
helps children.” “This child is often blamed for things he does.” The
scale is scored according to the number of “correct” responses, 100
being perfect. The “correctness” for good adjustment is based on a
validation study made by Pintner and others (10). The higher the score,
the better adjusted presumably is the child.

The correlation between SQ and score on the Pintner test for the total
group of 278 was found to be r = .36. When this correlation was
corrected for life age, the partial correlation became r = .41.

The results indicate a low positive relation 
between Vineland Social Maturity scores and 
Pintner Pupil Portrait test scores, that is, a 
slight relation between relative degree of so- 
cial maturity and adjustment. 


Conclusions 

A study of the results of the application of the Vineland Social
Maturity Scale to a group of grade school children in the 4th to the
8th grades has been reported to throw light on the possibility of using
this scale in educational classification programs. The results show
that there are significant differences in social maturity of retarded,
regular, and gifted classes as grouped by existing procedures, that
there is a close relation between intelligence and social maturity, and
that there is a low positive relation between adjustment and social
maturity.

* The average of the σ’s for the two distributions was used as the σ in
this formula.

The high relation between social maturity and intelligence might seem
to indicate that both measures are not needed; that one measure is a
sufficient indicator of both. However, the spread of SA’s for any one
MA, as shown in Table 3, reveals that there would be considerable error
of prediction in individual cases, as much as 3 or 4 years. It would,
of course, be unwise to use social maturity alone as a criterion for
class placement, since degree of intelligence seems to be the more
important factor in learning. However, it might be practicable to use
the two together. The inclusion of gifted and retarded classes within
each grade of a school system would seem to provide a means of
adjusting the placement according to both intellectual and social
maturity.

The value of the scale in the study of individual problems might also
be emphasized. Poor adjustment of some children may be related to
discrepancies between social maturity and mental maturity. A child who
is intellectually superior to children of his own age but is not
correspondingly superior in social competence, and so is unable to
compete socially with children older than himself, may find no group to
which he “belongs.” The clear recognition of such a discrepancy between
mental and social maturity may facilitate a satisfactory solution of
the child’s problem.
A STUDY OF OSCILLATION AS A UNITARY TRAIT


MARIAN E. MADIGAN

Milwaukee Vocational School
Milwaukee, Wisconsin


INTRODUCTION


No one can “attend” continuously to a task. With the best of attention
output fluctuates. Flugel showed that people who oscillate widely in
one thing tend to oscillate widely in other things, and vice versa. It
was on the basis of this investigation that Spearman, applying the
criterion of tetrad differences, laid claim to a general factor of
oscillation, symbolized by him as “o.”

THE PROBLEM


This paper represents an attempt to investigate thoroughly the
behavior-unit of oscillation. More explicitly stated, its purpose is,
by the scheme of intercorrelations, tetrads, and factorial analysis, in
connection with other variables, to ascertain the existence of
oscillation as a unitary trait. Closely akin to the main problem will
be the consideration of the validity and reliability of oscillation
tests, the effect of two methods of scoring, and a comparison of
oscillation of children with that of adults.

Data

Some thirty variables centering around the assumed factors of
oscillation, perseveration, spatial relations, mental speed, motor
speed, attention, fluency, and memory were selected. This battery was
given to 117 adults, the majority being graduate students.

The tests utilized in this study were some of those previously
administered to school children in the Spearman-Holzinger unitary
traits study. In the oscillation series, Test 42 was made up of
clusters of dots, Test 43 utilized capital letters, Test 44 made use of
segments of “X” and a “□”, Test 45 was comprised of a series of digits,
and Test 46 consisted of the segments of “X,” a “□,” and ... . The
first four tests have their content distributed along zigzag lines
which twist irregularly back and forth across a page,
C. Spearman, The Abilities of Man. New York: Macmillan.
Preliminary Report on Spearman-Holzinger Unitary Trait Study. Prepared
at the Statistical Laboratory, Department of Education, University of
Chicago, 1934.


18 14. The symbols in Test 46 were
distributed in straight lines across the page.

In Test 42, the subject was to count the number of dots; in Test 43 he
was to distinguish between capital letters made entirely of straight
lines and those involving curved lines; the task in Tests 44 and 46 was
to identify the symbols to which the segments belonged; and in Test 45
the subject was to add the digits by twos, putting down only the units
figure.


Experimental set-up 

Preliminary investigations in administering the oscillation tests as
group tests had resulted in the waste of a large amount of data. In
order to minimize this loss and secure reliable data, the oscillation
tests were administered individually. The remainder of the battery was
administered as group tests.

Since measures of “o” were to be derived from the variation in output
of precise units of mental work, an accurate automatic timing device
was arranged which clicked off intervals of five seconds. The subject
was admonished to work as rapidly as possible in order that speed could
be kept constant. He was carefully instructed to put a tick-mark on his
paper at the place where he was working when the hammer clicked and
then continue working at top speed. In the effort to maintain top speed
some of these interval-marks were omitted. It was the business of the
examiner to note carefully their omissions and insert them after each
test. Good records of responding were thus secured in every case. The
administering of the tests individually not only made for better
responding, but it allowed the subject to complete each test. A short
form, approximating about half the content of the long form, preceded
each “o” test.
VALIDITY AND RELIABILITY


Validation

Internal consistency among supposedly related variables was the basis
on which validation was studied. If four or more measures
intercorrelate positively, then there is justification for postulating
a trait which is indicated by each of the measures*. Intercorrelations
for both the short and the long forms were considered.

The average of the ten intercorrelations for the short forms was .2072
and the average for the long forms of the same set was .4190. However,
length was not the only factor making for the increase in the
intercorrelations. The average intercorrelations showed that a test of
60 five-second intervals yielded as good results as one of 180
five-second intervals in length. Inasmuch as the number of different
items in each test was relatively few, thereby making for a great deal
of repetition throughout the test, the set of “o” tests was considered
relatively constant in difficulty. On the assumption that these tests
measure oscillation, and the intercorrelations lend support to this, it
would seem that the better intercorrelations are due to the content of
the tests. Ranking first is Test 43.

Scoring was based on the mean deviation and the method of differences.
In this latter method, if a subject’s output were constant, there would
be no differences, but as his output fluctuates the absolute change
from interval to interval is summed and divided by the number of
intervals for his score. Flugel makes the statement that he adopted
this method instead of the mean deviation

    because of its greater simplicity from the point of view of
    calculation and because the measure of variability obtained by such
    a method is relatively unaffected by any constant tendency of the
    value to rise or fall, such a constant tendency being of course
    present in our data in the shape of ...

To the writer this suggested that intercorrelations secured by this
method would be higher than by the deviation formula. The comparison
was made on the basis of the longer forms because fatigue would be
expected to be present to a greater extent than on short forms. The
results in Table I do not support the investigator’s interpretation.
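The two methods of scoring just described may be illustrated by a minimal sketch in Python. The function names and the sample outputs are hypothetical. Dividing the summed differences by the number of intervals follows the wording of the text, although a series of n intervals yields only n − 1 differences.

```python
from statistics import fmean


def mean_deviation_score(outputs):
    """Average absolute departure of each interval's output from the mean."""
    if not outputs:
        raise ValueError("at least one interval is required")
    mean_output = fmean(outputs)
    return fmean([abs(x - mean_output) for x in outputs])


def difference_score(outputs):
    """Flugel's method as described in the text: the absolute change from
    interval to interval, summed and divided by the number of intervals."""
    if not outputs:
        raise ValueError("at least one interval is required")
    total_change = sum(abs(b - a) for a, b in zip(outputs, outputs[1:]))
    return total_change / len(outputs)


if __name__ == "__main__":
    steady_gain = [10, 11, 12, 13, 14, 15]   # constant rise, no oscillation
    fluctuating = [12, 14, 11, 13, 12, 14]   # little trend, much oscillation
    for label, series in (("steady gain", steady_gain),
                          ("fluctuating", fluctuating)):
        print(label,
              round(mean_deviation_score(series), 3),
              round(difference_score(series), 3))
```

A series that merely rises steadily receives a sizeable mean deviation but a small difference score, which is the property Flugel cites in favor of the method of differences.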

“Holding amount constant” means that the subject finished the test
regardless of time. “Time constant” basis is the time that it took the
fastest subject to complete the test. In this situation the results are
calculated on the same number of intervals for all subjects, the number
varying per test.

A comparison of the results in Table I shows the averages for the “Time
Constant” basis to be lower. That is, the length of the test has been
reduced and, in all but three instances, the correlations are lower.

L. L. Thurstone, The Reliability and Validity of Tests, p. 101. Ann
Arbor, Michigan: Edwards Brothers Inc., 1932.

J. C. Flugel, Practice, Fatigue, and Oscillation. The British Journal
of Psychology Monograph Supplement, No. 13, p. 66. Cambridge University
Press, 1928.

From the foregoing results, it is evident that the validity of
oscillation tests on the basis of internal consistency depends on
length. That is, other things being equal, the longer the tests the
better the intercorrelations resulting therefrom. But investigations
reveal that beyond 60 five-second time intervals, “o” tests do not
yield increasingly better results commensurate with the time and labor
involved. Of the two methods of scoring, the results favor the mean
deviation method.





Reliability 

Since the “o” tests are the continuous function type of test, alternate
items or intervals could not be used. Therefore it was necessary to use
the first half of the test and compare it with the second half. Since
the content of the tests was of the same level of difficulty
throughout, it does not seem that it would be objectionable to use
continuous halves for a basis of reliability.

The primary purpose in administering the short forms of each “o” test
was to thoroughly acquaint the subject with the test before taking the
long form. It so happened that the content of the short form was
identically repeated in the beginning of the long form. By cutting the
long tests off to parallel the content of the short tests, two
identical forms of the same test were available. Since the long form
doubled the content of the short form in three tests, this allowed an
empirical check on the use of the Spearman-Brown formula. The other
tests were about one and a half times their short forms.
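The Spearman-Brown formula referred to here estimates the reliability of a test lengthened k times from the correlation between two comparable parts. A minimal sketch in Python follows; the function name and the sample coefficient are hypothetical and are not taken from the tables.

```python
def spearman_brown(r: float, k: float = 2.0) -> float:
    """Predicted reliability of a test k times as long as the part tested.

    r_kk = k * r / (1 + (k - 1) * r)
    """
    if k <= 0:
        raise ValueError("length factor k must be positive")
    denominator = 1.0 + (k - 1.0) * r
    if denominator == 0.0:
        raise ValueError("formula undefined for this r and k")
    return k * r / denominator


if __name__ == "__main__":
    half_test_r = 0.48                                  # hypothetical
    print(round(spearman_brown(half_test_r, 2.0), 4))   # form doubled
    print(round(spearman_brown(half_test_r, 1.5), 4))   # form 1.5 times as long
```

With k = 2 the formula steps up a correlation between halves to the full length, which is the case of the three tests whose long forms doubled the short forms; k = 1.5 corresponds to the remaining tests.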

One of the most illuminating bits of evi- 
dence from a study of these results in Table 
II is the fact that the average of the correla- 
tions by identical forms is practically the 
same as the average resulting from the relia- 
bility of the long forms. It is the opinion of 
the writer that such a comparison as this 
yields conclusive evidence bearing on practice 
effects. There was no intention in the mind 
of the examiner to hold practice effects at a 
minimum. In this case they might well be 
considered at a maximum since the short 
TABLE I
CORRELATIONS OF “o” TESTS BASED ON TWO DIFFERENT METHODS OF SCORING

Amount Held Constant Basis
          Mean Deviation Method              Flugel’s Method
Test      43      44      45      46         43      44      45      46
42        051     .3062   .3908   .33875     .3171   .2365   .4339   .32K
43                .5100   .4557   .5086              .4630   .8962   .4385
44                        .4003   .4928                      .4199   .420
45                                .4826                              .408
Average                           .4190                              .3861

Time Held Constant Basis
          Mean Deviation Method              Flugel’s Method
Test      43      44      45      46         43      44      45      46
42        .2747   .1795   .3477   .3059      .2948   .1961   .3973   .2691
43                .4464   .4345   .4981              .4253   .3069   .4205
44                        .3153   .3670                      .2871   .3078
45                                .4939                              0045
Average                           .3663                              .3260
forms approximated half the content of the long forms. Such evidence as
this should allow the experimenter to provide ample practice periods
for subjects with no fear of practice effects.

Corresponding halves of these “o” tests were correlated and stepped up
by the Spearman-Brown formula to note their comparisons with the
intercorrelations resulting from the complete tests. On the “Amount
Held Constant” basis the average of the intercorrelations on the first
half of each test was .4554. For the second half it was .4914. The
average of the intercorrelations on the complete tests was .4188. On
the “Time Held Constant” basis, these averages were respectively .4332,
.3892, and .3605. Thus, the Spearman-Brown formula applied to
corresponding halves is sensitive enough to yield results differing by
only .03 to .07 from those obtained from the complete tests.

The results of this section indicate that the “o” tests are stable in
yielding reliability coefficients of the same magnitude on the
continuous half method as well as on identical forms. Practice effects
are negligible and length plays relatively the same role in reliability
as in validity. The Spearman-Brown formula applied to intercorrelations
of corresponding halves yields results similar to those obtained from
the complete tests.


THE EFFECT OF INCREASING THE TIME UNIT
ON THE MEASUREMENT OF OSCILLATION


The results of a preliminary investigation of 25 cases on varying the
time unit pointed to the operation of a law in measuring oscillation.
The possibilities inherent in such an
TABLE II
RELIABILITY OF 117 CASES OF “o” TESTS INVOLVING TWO DIFFERENT METHODS OF SCORING:
ALL RESULTS STEPPED UP BY THE SPEARMAN-BROWN FORMULA

                      Mean Deviation Method      Flugel’s Method
          Identical   Amt. Held    Time Held     Amt. Held    Time Held
Test      Forms       Constant     Constant      Constant     Constant
42        ...         .4961        .5857         .5215        .4937
43        .6943       .7615        .6609         .6937        .6005
44        ...         .6727        .5209         .5967        .4322
45        .6610       .6695        .6897         .6921        .6029
46        ...         .6563        .6563         .4943        .4943
Ave.      .6501       .6512        .6227         .5997        .5247
Figure 1. The graphs of oscillation for children and adults based on
two methods of scoring on varying time units.





operative factor warranted an investigation on 
the larger sample of 117 cases. 


Basis of Procedure 

The oscillation tests were scored on the 
basis of 5-, 10-, 15-, 20-, and 25-second inter- 
vals. Two methods of scoring were investi- 
gated. 

In order to provide the same length for all subjects on a test, the
“Time Constant” basis was used. Three tests were approximately 45
five-second intervals in length while the other two were 70 and 114
five-second intervals. The last step involved the intercorrelations of
these scores on the five different time units according to the two
methods of scoring.
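Rescoring a record on a coarser time unit may be sketched as below. This is a minimal Python illustration under an assumption the text does not state explicitly: that the output for a 10-, 15-, 20-, or 25-second unit is obtained by summing consecutive five-second outputs, any incomplete unit at the end being dropped. The function name and the sample record are hypothetical.

```python
def regroup_intervals(outputs, factor):
    """Combine consecutive five-second outputs into units `factor` times
    as long (factor = 2 gives 10-second units, 3 gives 15-second units).

    An incomplete unit at the end of the record is dropped (an assumption).
    """
    if factor < 1:
        raise ValueError("factor must be at least 1")
    usable = len(outputs) - len(outputs) % factor
    return [sum(outputs[i:i + factor]) for i in range(0, usable, factor)]


if __name__ == "__main__":
    five_second_record = [3, 4, 5, 4, 3, 5, 4]   # hypothetical outputs
    for factor in (1, 2, 3):
        print(factor * 5, "seconds:", regroup_intervals(five_second_record, factor))
```

Either method of scoring may then be applied to each regrouped record, giving one oscillation score per subject for each of the five time units.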





VAL OF 


investigation dealt with 

> cases with adults. In Fig- 
graphs have been displayed to- 
obvious that scoring by either 
smallest available time unit 
sults. One might raise the 
this unit of time 
duce the maximum 
Beyond a 15-second inter- 
‘nd rapidly toward zero. 


how small 


rder to pr 


the two methods comes 


nd and the 15-second in
between the and 25-
for children while the plateau
tion method occurs between
ynd intervals. But in the
the parallelism occurs


second intervals and 


between the 10- and 15
The almost down 

Flugel’s method for children 
second interval while a some- 

er descent starts at the 10-second
adults. Likewise the sharp

the mean deviation method for chil- 
ites at the to 25-second interval 


smoother descent operates at the 15-


direct 


nterval for adults 
nce as this that points to 
may be in- 
[he more 
ncy has been conditioned by ex 
til it takes a finer unit of measure 


that perhaps maturation 


cca 
this factor of oscillatio 


5 presence 


FACTOR ANALYSIS OF THE DATA


The factor analysis of these data proceeds according to the Two-Factor
method of Spearman. Aside from using the tetrad criterion, residuals
were resorted to most freely. Residuals will be insignificant if the
tetrads vanish for the set of variables. The agreement of possible
factors with that of the large residuals has been shown to be
sufficiently accurate for the allocation of suspected extra factors*.
Whenever critical points arose, tetrads were used as a check. Thus the
tedious task of computing thousands of tetrads was eliminated, the
method of residuals and, if necessary, the formulation of more than one
pattern proving the more expedient.

* Spearman-Holzinger, Preliminary Report on the Unitary Trait Study ...
23. Prepared at the Statistical Laboratory, Department of Education,
University of Chicago, 1934.
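The connection between vanishing tetrads and insignificant residuals, on which the procedure above relies, may be made concrete with a minimal sketch in Python. It assumes the simplest case of a single general factor, in which each correlation is the product of two loadings. The loadings, the function names, and the disturbance of .15 are all hypothetical and are not drawn from the study.

```python
def single_factor_correlations(loadings):
    """Correlation table implied by one general factor: r_ij = a_i * a_j."""
    n = len(loadings)
    return [[1.0 if i == j else loadings[i] * loadings[j] for j in range(n)]
            for i in range(n)]


def tetrad_differences(r, a, b, c, d):
    """The three tetrad differences for variables a, b, c, d."""
    return (
        r[a][b] * r[c][d] - r[a][c] * r[b][d],
        r[a][b] * r[c][d] - r[a][d] * r[b][c],
        r[a][c] * r[b][d] - r[a][d] * r[b][c],
    )


def residuals(r, loadings):
    """Observed correlation minus that reproduced by the general factor."""
    n = len(loadings)
    return {(i, j): r[i][j] - loadings[i] * loadings[j]
            for i in range(n) for j in range(i + 1, n)}


if __name__ == "__main__":
    loadings = [0.8, 0.7, 0.6, 0.5]            # hypothetical general loadings
    r = single_factor_correlations(loadings)
    print(tetrad_differences(r, 0, 1, 2, 3))   # all vanish (up to rounding)

    # Suppose tests 0 and 1 share something beyond the general factor.
    r[0][1] = r[1][0] = r[0][1] + 0.15
    print(tetrad_differences(r, 0, 1, 2, 3))   # two tetrads no longer vanish
    print(residuals(r, loadings)[(0, 1)])      # the large residual marks the pair
```

The large residual falls on exactly the pair of tests that disturbs the tetrads, which is why inspecting residuals can stand in for computing every tetrad.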
Indications of Factors

The intercorrelations of the 29 variables in general show a high degree
of consistency amongst the clusters of tests for each assumed factor
and, with the exception of mental speed, the correlations amongst the
tests representing different assumed factors are low. Tetrads for each
of these other sets of four or more variables are generally
significant. If it were not for this overlap, the large value of r for
mental speed would indicate a satisfactory measure of such a composite.
With the exception of the perseveration tests, the findings in Table
III indicate that measures of the various assumed factors are fairly
satisfactory. Such evidence as this allows one to proceed with the
factorization of the data.


TABLE III


TRADS AND CORRELATIONS rp; FOR 
ASSUMED FACTOR 


P. E. largest tetrad 
in set 

(—.0151) 

.( .2356) 

ue ¢ .0944) 
Ee. (—.0462) 
x 
( 


Assumed 
Factor 

a 
Mental Speed _-- 
Motor Speed 
Perseveration _- 
Oscillatio 
Attention 


.053 
.0116 
.0329 


.0312 


0772) 
.1095) 


The Patterns 

The true factor pattern for any set of variables is probably very
complex. The guiding principle in selecting a factor pattern is to try
the simplest one first. A considerable number of patterns might be
formulated as defensible explanations of underlying relations. The
arrangement of these patterns determines the order in which factors are
removed and, since the first factor removed is generally favored, there
arises a certain amount of ambiguity in the results. This ambiguity has
been removed by Holzinger* in the formulation of the “hollow staircase”
type. In this pattern, after the principal factor has been removed, the
remaining factors may be taken out in any order and the factor loadings
will not change their order of magnitude.

The investigator set up four patterns. Pattern A involved four tests of
a kind for five of the factors and five tests for the “o” factor.
Pattern B dealt with the consideration of an overlap between space and
attention, involving the same variables as in Pattern A. ... factors
were dealt with in Pattern C, the perseveration tests being omitted
because of their low intercorrelations and the ... fluctuation with
other variables. Pattern D was applied to all 29 variables. The extra
tests, memory and fluency, were ... to one of the six assumed factors
on the basis of their correlations. With the exception of Pattern B,
the “hollow staircase” ... is used.

* K. J. Holzinger and F. Swineford, “Uniqueness of Factor Patterns”,
Journal of Educational Psychology XXII (1932), 247-58.

* Spearman-Holzinger, Op. Cit., No. 5, p. 5.


i the Four Patterns 


In order to secure a basis of comparison for the secondary factors
throughout each of the patterns, the average variance for each of the
factors was calculated. The results are summarized in Table IV.


TABLE IV
VARIANCE OF THE GENERAL FACTOR, THE SECONDARY FACTORS, AND THE SPECIFIC
FACTORS FOR FOUR PATTERNS

             Per Cent of Variance of Factors
Pattern      General      Secondary      Specifics
A            ...          28.00          55.37
B            ...          29.23          55.38
C            ...          29.54          49.09
D            ...          24.76          56.68


The ... specifics claim from 49 to 57 per cent of the factor weights,
while the general factor in no case exceeds the variance of the
secondary factors. In Pattern B the variance of the secondary factors
is nearly ... the general factor. Evidently the assumption of a
spatial-attention factor has ... the specifics but has favored the
secondary factors at the expense of the general factor. The elimination
of the poor perseveration tests in Pattern C has decreased the
specifics and increased the general factor. In many instances the
general factor loadings for a test exceed those of the secondary
factor. It is this overshadowing of the secondary factors that makes it
highly desirable to be able to analyze or split up this general factor
into more specific and meaningful components.


It seems fitting to comment at this point on the recent findings of the
Committee on Unitary Traits. Mr. Holzinger remarks as follows:

    In attempting to analyze the U₁ (general) factor we have eliminated
    Grade as well as Age ... comparing this ... wherein only Age has
    been eliminated, we find that the weight of U₁ drops from 28.81 per
    cent of variance ... to only 14.21 per cent. These results are very
    promising because they indicate the possibility of resolving the U₁
    factor into a number of basic maturity ...


The fact that the general factor variance in 
all four patterns of this study does not in any 
instance equal the amount of the secondary 
factor variance would tend to support the 
findings of Mr. Holzinger. The subjects of 
the present investigation were adults. Would
not age and maturation in this case be mini-
mized thereby reducing the general factor
loadings and increasing the secondary
weights?

The relative importance of the secondary
factors is given in Table V. It is evident
from this table that of all 29 variables studied,
the spatial and “o” tests are the best devised
because they are more highly saturated with
what they are supposed to measure. Even at
that, the saturation amounts only to approxi-
mately 40 per cent. Thus the big problem
for the factorists still remains: that of de-
vising tests that produce high saturations of
what they are intended to measure.

* Spearman-Holzinger, Op. Cit., No 


TABLE V 


AVERAGE LOADINGS FOR SECONDARY FACTORS FROM FOUR PATTERNS


itor Speed _-_ 
Perseveration 
scillation 


t+ . 
tentior 


i 
atial A 


tention 


Pattern
B C D Average
.3711 .4342 .3870 .4045
.3326 .2685 .2281 .2839
.0935 .1910 .1154 .1383
.1763 ... .1419 .1615
.3672 .3950 .4067 .3889
... .1640 .1930 .1843
.1972 .1972
The Magnitude of the Final Residuals for the
Four Patterns

Table VI shows the distribution of the re-
siduals for each pattern with their means,
standard deviations, and probable errors. The
smallest mean, almost zero, occurs in Pattern
D, where the greatest number of re-
siduals is recorded. It is interesting to note
that the highest mean occurs with the lowest
frequency and that, as the number
of residuals increases, the mean decreases. This in-
direct variation would seem to indicate that
the variation in the mean is due to chance.

With only a difference of .0081 between the
highest and lowest standard deviation, each
pattern may be considered as an equally good
fit. The slight increase in Pattern D is prob-
ably due to the allocation of extra factors.
The elimination of the perseveration tests
yielded a slightly lower standard deviation in
Pattern C but not enough to be judged
significant.

When the standard deviation of each
pattern is multiplied by .6745 and compared with
the P. E. of zero correlation for 117 cases, the
resulting values are all less
than the probable error of zero correlation.
Using this as a rough basis of approximation,
one can say that the final residuals of each
pattern may be considered negligible.
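
The comparison described above can be sketched in a few lines. This is only an illustration: it assumes the conventional probable-error formula for a correlation, P.E. = .6745(1 - r²)/√N, which at r = 0 reduces to .6745/√N; the text does not state the formula it used. The function names and the residual standard deviation of .08 are hypothetical and are not taken from Table VI.

```python
import math


def probable_error_zero_r(n_cases: int) -> float:
    """P.E. of a zero correlation, assuming P.E. = 0.6745 * (1 - r**2) / sqrt(N) at r = 0."""
    return 0.6745 / math.sqrt(n_cases)


def is_negligible(residual_sd: float, n_cases: int) -> bool:
    """Rough criterion from the text: 0.6745 * S.D. falls below the P.E. of zero correlation."""
    return 0.6745 * residual_sd < probable_error_zero_r(n_cases)


# 117 cases is the figure given in the text.
print(round(probable_error_zero_r(117), 4))  # 0.0624

# 0.08 is a hypothetical residual S.D., used only to exercise the check.
print(is_negligible(0.08, 117))  # True
```

Under this assumption the criterion amounts to asking whether the standard deviation of the residuals is smaller than 1/√N.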


The indications of factors by the various
statistical techniques have been verified by the
resulting factor patterns. Each pattern, judg-
ing by the P. E. of its standard deviation
compared with the P. E. of zero correlation
for the same number of cases, could be con-
sidered a good fit. The spatial and “o” tests
were most highly saturated with their respec-
tive factors.


SUMMARY AND INTERPRETATIONS 

The findings presented in this study lead to
the following conclusions:

(1) Length up to about 60 five-second in-
tervals for “o” tests is sufficient to secure a
good measure of “o”. Good reliability re-

® Spearman—Holzinger, Op. Cit., No. 4, pp. 3-4 


TABLE VI

FREQUENCY DISTRIBUTIONS OF FINAL RESIDUAL CORRELATIONS FOR
PATTERNS A, B, C, AND D

Value of Residual     Pattern A   Pattern B   Pattern C   Pattern D
 .2700 - .2899
 .2500
 .2300
 .2100
 .1900
 .1700
 .1500
 .1300
 .1100
 .0900
 .0700
 .0500
 .0300
 .0100
-.0100
-.0300
-.0500
-.0700
-.0900
-.1100
-.1300
-.1500
-.1700
-.1900
-.2100
-.2300
-.2500
-.2700

Total
Mean
S. D.
.6745 × S. D.
P. E. zero correlation
are secured from oscillation tests 60 to [...]
five-second intervals in length.

(2) The application of the Spearman-
Brown formula to the continuous halves of
the “o” test in general predicted results com-
parable with the intercorrelations of the com-
plete tests.

(3) Practice in no way affects the scores
of the “o” tests.

(4) The mean deviation method for scor-
ing the “o” tests proved superior to Flugel’s
method in both validity and reliability.

(5) The narrowness of the time unit in
scoring proved to be a most significant factor
both in securing good measures of “o” and in
yielding good reliability coefficients.

(6) The graphs of the two sets of data for
oscillation from adults and children reveal
similar characteristics.

(7) There is some evidence to show that
children oscillate over a wider range than do
adults.

(8) The poolings of the “o” tests as well as
their factor weights indicate that they are
fairly good measures of “o”.

In general the facts indicate that “o” is a
very sensitive behavior-unit. Only by nar-
rowing the time basis can the individual dif-
ferences of this trait be measured. Oscilla-
tion turns out to be a very definite com-
ponent in human abilities as revealed by the
factor loadings, the tests being saturated
with the “o” loadings and possessing but
small loadings with the general factor.
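
The conclusions above refer to the Spearman-Brown formula as applied to the halves of a test. A minimal sketch of that formula follows, in its standard form r_k = k·r / (1 + (k − 1)·r). The function name is hypothetical and the half-test correlation of .60 is an illustrative value, not a result of this study.

```python
def spearman_brown(r_part: float, k: float = 2.0) -> float:
    """Predicted reliability of a test lengthened k times, given the part reliability r_part."""
    return k * r_part / (1.0 + (k - 1.0) * r_part)


# Hypothetical example: two halves of a test correlate .60 with each other.
full_length = spearman_brown(0.60)  # k = 2 predicts the whole test from its halves
print(round(full_length, 2))  # 0.75
```

The prediction can then be set beside the observed intercorrelations of the complete tests, which is the comparison the second conclusion describes.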








SOME VERBAL ASPECTS OF THE 1937 REVISION OF THE 
STANFORD-BINET INTELLIGENCE TEST, FORM L 


ELDEN 


: 7 
Wher CL ovege 


ved that the old revision of 
many test items 


helic 
Binet 

depend on verbal knowledge for their 

In fact yf the such as 

iry tests and abstract words tests 

» tests of word knowledge. When Terman 
Merrill constructed the new revision they 

d to reduce the ni these ver- 


na were to a extent 


inciudea 


items, 


imber of 
great succes 
They found it ex- 
ne ly aim t | non-verbal tests for 
therefore 


upper question 


es whe s on the upper teveis are 


s subiect to specific disabil- 
(Year XIV, Aver-
III levels) seem to be
Form
th sixty-five
criteria: 
quotient. 


poor 


poor 


> # 


one 


A. BOND
Research Assistant
Columbia University


sigma in total comprehension on the reading
test used. These children were also matched
on years in school inasmuch as the policy of
the Mansfield City System is to advance the
children one grade a year, segregating those
encountering difficulty for special remedial in-
struction.* The statistical analysis of the
matching is given in Table I.

The correlations of .97 and .73 and the
critical ratios of .7 and .6 indicate that the
groups are very similar with respect to intel-
ligence quotient and chronological age. Be-
cause reading is highly correlated with intel-
ligence,* the children with high intelligence
quotients tended to make the highest reading
scores in each group. The correlation of [...]
in reading ability between the good and poor
readers is evidence of this. However, the
critical ratio of 39.0 definitely establishes the
fact that the two groups are distinctly differ-
ent in reading ability. The homogeneity of
the two groups with respect to chronological
age results in the lower correlation of [...].

An analysis of the errors on each Binet test
item was made, and a total of the errors on
each item was taken. It was found that the
poor readers had one or two fewer errors on
most of the items with the exception of those
reported below in Table II.


*A more detailed report 
unsuccessful children from t 


article by ¢t wr in the Journal of Ju 


of the method of segregating 
TABLE I


MMARY F THE MATCHID F 


bo 


Re 


177. 
i 


THE GOOD 
Pai 


Iowa 
(Raw Sx 
Good 


Chronological Age 
(Months) 

P or 
Readers Readers 
16 177.2: 115.09 


2 .f 21.32 


Good 
iders 


ee 











Level     Item
XIV       1, Vocabulary
...       ..., Vocabulary
          3, Abstract Words
SA I      1, Vocabulary
          3, Sentence Building
SA II     1, Vocabulary
...       1, Vocabulary

Since these cases were matched on the basis
of intelligence quotient, it was necessary for
the poor readers to make higher scores on
other items to compensate for their relatively
poor showing on the verbal tests.

Intelligence quotients were then established
by regrading each of the Binet tests, omitting
those items listed in Table II. When any
items were omitted from a year level the pro-
cedure was to increase the credit for the re-
maining items. The results of this treatment
changed some of the intelligence quotients as
much as fifteen points, usually in favor of the
poor readers. The statistical analysis of these
results is given in Table III.

TABLE III 
SUMMARY OF THE MATCHING OF THE INTELLI-
GENCE QUOTIENTS OF THE GOOD AND POOR
READERS AFTER THE OMISSION OF TEST
ITEMS LISTED IN TABLE II


Good Poor 
Readers Readers 
ws ae 110.45 
tein eiieal siuabiies 11.8 12.3 
93 
between Means __-_-- 5.0 


° Diff. 


The results listed in Table III indicate that
after the correction had been made the
poor readers on an average had a higher in-
telligence quotient than the good readers.
The correlation between the good and poor
readers is still rather high, but lower than be-
fore, and the critical ratio of 5.0 indicates
that the two groups are significantly different
in intelligence quotient.
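
A critical ratio for matched groups of this kind is ordinarily the difference between the means divided by the standard error of that difference, with the correlation between the pairs taken into account. The sketch below assumes that standard formula, since the text does not state which one was used. The standard deviations (11.8 and 12.3) and the correlation (.93) are read from Table III. The number of pairs (65) and the mean difference of 3.0 points are assumptions made for illustration, and the function names are hypothetical.

```python
import math


def se_of_mean(sd: float, n: int) -> float:
    """Standard error of a mean, sd / sqrt(n)."""
    return sd / math.sqrt(n)


def se_diff_correlated(sd1: float, sd2: float, r: float, n_pairs: int) -> float:
    """Standard error of the difference between means of two matched (correlated) groups."""
    se1 = se_of_mean(sd1, n_pairs)
    se2 = se_of_mean(sd2, n_pairs)
    return math.sqrt(se1 ** 2 + se2 ** 2 - 2.0 * r * se1 * se2)


def critical_ratio(mean_diff: float, se_diff: float) -> float:
    """Difference between means expressed in units of its standard error."""
    return mean_diff / se_diff


# S.D.s and r from Table III; 65 pairs and a 3.0-point difference are assumptions.
se_d = se_diff_correlated(11.8, 12.3, 0.93, 65)
print(round(se_d, 3))
print(round(critical_ratio(3.0, se_d), 1))
```

The high correlation between the matched groups is what makes the standard error of the difference small, so that a difference of only a few points can yield a large critical ratio.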


Conclusions
1. Ninth grade children with poor reading
ability tend to have more difficulty with the
verbal elements of the 1937 Revision of the



TABLE II 


TEST ITEMS MISSED MORE FREQUENTLY BY THE POOR READERS


Errors 
Good Poor 
Readers Readers 
2 7 
28 40 
19 26 
48 56 
13 49 
58 64 
60 65 


Stanford-Binet, Form L, than do children with
good reading ability. Reading disability cases
are unable to read as much, or as difficult
materials, as good readers, and as a result they
do not come in contact with as many words.
Their vocabularies are not so well developed,
and they are handicapped when taking any
test which utilizes verbal elements in its con-
struction. The fact that the poor readers did
better on some of the non-verbal materials in
this study is not an indication that they are
correspondingly better on that type of ma-
terial. The poor readers did only slightly
better on each of the non-verbal test items
(probably because they had to compensate at
some place), since they were originally
matched on the basis of intelligence quotient.

2. It seems advisable to omit all the verbal 
tests listed in Table II when administering the 
1937 Revision of the Stanford—Binet to a 
reading disability case. 

The writer feels that the Vocabulary Test 
Items on the VIII and XII Year Levels 
would also be a handicap for poor readers. 
Some of the other test items, such as Reading 
and Report on the X Year Level, abstract 
words on the XI and XII Year Levels, and 
the dissected sentences on the XIII Year 
Level probably handicap poor readers. A 
similar study should be made using fifth or 
sixth grade children to establish significance 
of the verbal items on these lower levels. 

3. After the corrections were made on the 
Binets, some of the pairs differed as much as 
twenty points in intelligence quotient. The 
average was about three points, in favor of 
the poor readers. It has been the practice 
of some school psychologists to add a con- 
stant of from three to seven points to an in- 
telligence quotient obtained by a reading dis- 
ability case. This practice is open to ques- 
tion, since it is impossible to determine during 
a test situation how much the reading ability
is affecting the test results. A more logical
method would be to omit the verbal items re-
ported above and to determine the intelligence
quotient on the basis of the remaining test
items.

4. The change of approximately three intel-
ligence quotient points in favor of the poor
readers is not exactly a fair picture. Some of
the matched pairs did not take the test items
on the levels where all, or most, of the correc-
tions were made. Consequently they pulled
down the averages. By omitting the lower 23
matched pairs an average intelligence quo-
tient of 117.7 was obtained for the poor read-
ers and an average intelligence quotient of
113.4 was obtained for the good readers. This
difference of 4.3 is more significant than that
reported in Table III. By omitting these
cases a lower correlation and a higher critical
ratio would also be obtained.
FLUCTUATIONS IN THE CORRELATION BETWEEN PSYCHO-
LOGICAL TEST SCORES AND UNIVERSITY GRADES


Dewey B. Stuit
University of Nebraska 


The psychological examination is an instru-
ment widely used for the purpose of predict-
ing educational and vocational success. In
college personnel work we are anxious to de-
termine early in his career a student’s prob-
able success in academic work. It is very
desirable, therefore, to have available instru-
ments which will predict scholastic success
with a rather high degree of precision. The
effectiveness of a personnel program is in part
dependent upon the intelligent use of prognos-
tic measuring devices.

It was the purpose of this investigation to
determine the relationship between Ohio Psy-
chological test scores and freshman grades
over a five year period. If there is consider-
able relationship between grades and psycho-
logical test scores, then it should be possible
to use the psychological examination for the
purpose of guiding the individual with respect
to his educational program. Particularly is it
necessary to know whether the instrument
used for purposes of prediction operates with
a great deal of consistency from semester to
semester and from year to year. If we can
expect the same degree of efficiency from year
to year, then it becomes possible to place
greater reliance upon the psychological test as
a prognostic instrument in college personnel
work than would otherwise be the case.

The Ohio Psychological Examination, Form
17, was administered to the majority of
Teachers College freshmen from 1932 to 1935
inclusive. Form 18 of the examination was
used in 1936-37. The scores on these tests
are recorded in the office of the freshman ad-
viser together with other personnel data per-
taining to Teachers College freshmen.

Grades are reported at the University of
Nebraska in terms of percentages. Roughly
speaking a grade of 60-70 can be considered
a D, 70-80 a C, 80-90 a B, and 90-100 an A.
The university average is usually about 75 or
76. In this study grades are reported in
terms of weighted averages, that is, a five-hour
course carrying a grade of 80 is given a cor-
respondingly greater weight in the average
than a three-hour course in which a grade of
80 was earned.
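
The weighting described above can be illustrated with a short sketch, assuming that each course grade is weighted by its credit hours. The function name and the two example courses (a five-hour course at 80 and a three-hour course at 90) are hypothetical.

```python
def weighted_average(courses):
    """Average of percentage grades, each weighted by its credit hours.

    courses is a list of (credit_hours, grade) pairs.
    """
    total_hours = sum(hours for hours, _ in courses)
    return sum(hours * grade for hours, grade in courses) / total_hours


# Hypothetical record: a five-hour course at 80 and a three-hour course at 90.
print(weighted_average([(5, 80), (3, 90)]))  # 83.75
```

An unweighted average of the same two grades would be 85, so the five-hour course pulls the weighted figure toward its own grade.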

The most significant findings of the study 
are reported in Table I. It will be noted that 
the correlations for the first semester vary 
from .43 to .62 and for the second semester 
from .41 to .58. As one would expect, the 
best results were obtained with Form 18. 
The standard errors of these correlations vary 
from .04 to .06. It will be noted also that in
1933-34 and in 1935-36 the correlation for 
the second semester is greater than it was for 
the first semester. While one might antici- 
pate a lower correlation for the second semes- 
ter because of a more homogeneous population 
being enrolled at that time, the results do not 
bear out this hypothesis. Examination of the 
standard deviations will show that the vari- 


TABLE I

MEANS, STANDARD DEVIATIONS AND COEFFICIENTS OF CORRELATION FOR
FRESHMEN GRADES AND OHIO PSYCHOLOGICAL TEST SCORES

                           Mean            S.D.                                             r
               N        Psych. Score    Psych. Score     Mean Grade      S.D. Grade    Grade & Score
            Semester      Semester        Semester        Semester        Semester        Semester
Year         1    2       1      2        1      2        1      2        1      2        1     2
1932-33     206  176    96.60  97.90    32.80  31.90    76.86  77.88     9.48   8.10     .46   .41
1933-34     169  153    93.50  94.40    33.20  33.20    77.37  77.07     8.43   9.00     .54   .58
1934-35     245  202    90.40  94.60    34.30  33.70    75.96  76.53     9.84   8.58     .53   .45
1935-36     231  203    89.70  92.79    33.10  33.00    77.58  78.75    10.23   8.46     .43   .46
1936-37*    221  206    76.20  76.61    22.15  22.42    78.56  76.87     8.58   9.95     .62   .55

* Form 18.








ability of the group each year was much the 
same during both semesters. 
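
The standard errors quoted for these correlations can be checked roughly with a short sketch. It assumes the classical formula S.E. = (1 - r²)/√N, which the text does not name; the function name is hypothetical. The two cases used are the first-semester entries of Table I with the lowest N-adjusted and highest correlations shown there (r = .46 with N = 206, and r = .62 with N = 221).

```python
import math


def standard_error_r(r: float, n: int) -> float:
    """Classical standard error of a correlation coefficient (assumed formula)."""
    return (1.0 - r ** 2) / math.sqrt(n)


# Values read from Table I (first semester).
print(round(standard_error_r(0.46, 206), 3))  # 0.055
print(round(standard_error_r(0.62, 221), 3))  # 0.041
```

Under this assumption both values fall within the range of .04 to .06 reported in the text.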

The results of this investigation are on the
surface not in agreement with those reported
by Williamson* for the Arts College of the
University of Minnesota. Williamson found
the correlation between college aptitude test
scores and scholarship to decrease from 1928
to 193[?]. Correction for homogeneity did not
remedy the situation. Williamson offers the
suggestion that educational reorganization at
the University of Minnesota probably ac-
counts for the decreasing relationship between
aptitude test scores and university scholarship.

If personnel work is effective, low aptitude
students should be guided into courses which
are commensurate with their abilities. It is
likely, therefore, that such procedures will re-
duce the relationship between grades and apti-
tude test scores. Since only students with
high aptitude ratings are permitted to register
in the Arts College, it is possible that grade
standings have not been adjusted to the na-
ture of the student population. Students who
in a more heterogeneous group would receive
above average grades are now probably aver-
age or below. It looks, therefore, as though
personnel work by its very nature should oper-
ate in such a way as to reduce the magnitude
of the correlation coefficients measuring the
relationship between university grades and
college aptitude.

The conditions at the University of Ne-
braska are quite different from those prevail-
ing at the University of Minnesota. No se-
lective admission standards are in operation,
and no educational reorganization has taken
place. It is true that Teachers College has
a freshman personnel program, but this has
not operated in such a way as to reduce the
heterogeneity of the group. Furthermore,
this program receives greatest emphasis dur-
ing the first semester, inasmuch as all fresh-
men are required to take the orientation
course at that time. Hence, it appears likely
that the chance for reducing the magnitudes
of the coefficients is as great during the first
semester as it is during the second semester.


The writer is of the opinion that the serious
economic conditions of the past five years
have played their part in disturbing the cor-
relations. In an unreported phase of this in-
vestigation it was found that the average psy-
chological test score of those dropping out at
the close of the first semester is only about
ten points below the average of the Teachers
College group. Hence, the group remains
heterogeneous from the first to the second
semester. Unequal motivation and unequal
encouragement also probably resulted in vari-
able conditions which served to reduce the
magnitude of the coefficients in some instances
while increasing them in others.

An interesting sidelight of the investigation
and a possible explanation of the fluctuations
in the size of the correlations from year to
year is the possible change in freshman intel-
ligence from 1932 to 1936. Thompson, in a
combination questionnaire study and survey
of the data reported in the Educational Rec-
ord concerning the American Council on Edu-
cation psychological test scores, found indica-
tions of an increase in freshman intelligence
in the colleges which he included in his inves-
tigation. [...]
in a study of the intelligence of freshmen en-
rolled in the colleges belonging to the Ohio
College Association. Williamson, on the
other hand, using the Minnesota College Apti-
tude Test as the means for measuring the
abilities of the students entering the Univer-
sity of Minnesota, failed to find a decided
change in freshman intelligence for the uni-
versity as a whole. In certain colleges
changes were brought about as a result of new
admission standards which were put into
effect in 1932.

The results of the present study are more
in harmony with those reported by William-
son. From 1932 to 1935 there was actually
a decrease in the average score in Form 17 of
the Ohio Examination. Since no studies have
been made of the other colleges in the univer-
sity, it would be difficult to determine whether
the decrease was characteristic only of Teach-
ers College or if it indicated a general trend
during the four year period. It happens that
the average score in Form 18 was correspond-
ingly somewhat higher, if the table of norms
for Forms 16, 17, 18, and 19 as furnished by
the Ohio State University is dependable
throughout the entire range of scores. It
may be, however, that the apparent “jump”
in intelligence is due to the test rather than
the change in freshman population.

In view of the fluctuations noted in the
magnitudes of the coefficients of correlation
measuring the relationship between scholar-
ship and psychological test scores, it seems
advisable to suggest that a more careful study
should be made of those individual students
who cause the magnitudes of our coefficients
to vary from semester to semester and from
year to year. These cases could be spotted
by examining the scatter diagrams constructed
for the purpose of calculating the coefficients
of correlation. By making an intensive study
of these cases it is possible that we might ob-
tain information which would enable us to
make better use of psychological tests in per-
sonnel procedures, and thus increase the ef-
fectiveness of our work. After all it is the
individual and not the group which is of
concern to those who are interested in
the guidance function of education.


SUMMARY 


The magnitudes of the coefficients of corre-
lation measuring the relationship between
Ohio psychological test scores and university
grades of Teachers College freshmen at the
University of Nebraska tend to fluctuate from
semester to semester and from year to year.
While a drop in the magnitudes of the
coefficients of correlation for the second se-
mester might be expected, such was not found
to be the case in the present investigation.
The causes of the fluctuations cannot be iden-
tified through the study of averages and
standard deviations. A critical study should
be made of individual students in order to dis-
cover those factors which cause the magni-
tudes of the coefficients to fluctuate.
PHOBIAS AND THE PRESSEY X-O TEST


N. FRANKLIN STUMP


Professor of Psychology and Education


Keuka College, Keuka Park, N.Y. 


The number of experimental studies which
can be undertaken by the use of the Pressey
X-O test¹ is almost unlimited. The extent
of application of this unique test to the inves-
tigation of many interesting problems rests
largely upon the ingenuity of the experi-
menter. Utilization of this test for the dis-
covery of personality traits in individuals has
scarcely made a beginning.

Section I of the Pressey test measures
sensitivity to unpleasant topics and objects.
The aim of the test is not explained to
the subject but he is given freedom of oppor-
tunity to check words pertaining to sex, fear,
disgust, and self-feeling. Section II of the
test did not seem to have any direct relation-
ship to the subject of this paper, phobias, so
the results on this section were ignored. In
Section III the subject checks situations
towards which he holds a disapproving atti-
tude. In Section IV he indicates topics about
which he has worried, and suggestions con-
cerning which he has felt nervous.

The purpose of this study was to determine
the differences in Test I, Test III,
and Test IV of the Pressey X-O Test, be-
tween those subjects with phobias and those
without phobias. The significance of the dif-
ferences between these two groups of subjects
on the combined tests was determined.
Test I could be graded on the basis of
the number of words checked in four different
fields, in terms of dislikes, namely, sex, fear,
disgust, and self-feeling; all of these elements
were considered separately as well as in com-
bination.

Before comparisons could be made between
the two groups two factors were carefully con-
sidered: (1) Are the subjects from one and
the same population, differing significantly
only with respect to the possession or non-
possession of phobias? (2) Are the phobias
possessed by the subjects sufficiently potent
to affect seriously social adjustment in par-
ticular situations or are the so-called phobias
only “hypercritical or finicky” feelings which
are common among the general population to
a greater or less degree?

In answer to the first question objective
results, which were available for the subjects,
seemed the most adequate manner of deter-
mining the extent of similarity of the groups.
The results on the American Council Psycho-
logical Examination² were available; the per-
centiles rather than the gross scores were
used. It was found that the mean percentile
for the group with phobias was 73.9; without
phobias 69.6. There was a difference of
4.3 percentile points between the two groups,
the percentile scores for the group with pho-
bias ranging from 52 to 95; without phobias
from 52 to 98, a range of 43 and 46
respectively.

To what extent is a difference of 4.3 in
general ability significant? The standard
deviations for the groups with and without
phobias were 14.60 and 16.48, respectively,
the standard errors of the means being [...]
and 5.80. The standard error of the differ-
ence was found to be 7.3; the ratio between
the obtained difference of the means and the
standard error of the difference being ex-
tremely small, namely, .59. Thus the value
of “P” according to “Student’s” tables³ is be-
tween .6 and .5. The conclusion is that the
difference in general ability between the two
groups is insignificant.
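
The ratio reported above is simply the obtained difference between the means divided by the standard error of that difference. The short sketch below repeats the arithmetic with the figures given in the text (mean percentiles of 73.9 and 69.6, standard error of the difference 7.3); the function name is hypothetical.

```python
def difference_ratio(mean_a: float, mean_b: float, se_difference: float) -> float:
    """Obtained difference between two means in units of its standard error."""
    return (mean_a - mean_b) / se_difference


# Figures from the text: mean percentiles 73.9 and 69.6, S.E. of the difference 7.3.
ratio = difference_ratio(73.9, 69.6, 7.3)
print(round(ratio, 2))  # 0.59
```

A ratio this far below 1 means the obtained difference is smaller than its own standard error, which agrees with the conclusion that the two groups do not differ reliably in general ability.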

Results which could be used for the compar-
ing of the groups were available from an-
other test. The Allport-Vernon Study of
Values⁴ test consists of six parts: Theoreti-
cal, Economic, Aesthetic, Social, Political, and
Religious. While the two groups did not
make identical average scores on these sec-
tions, there was considerable similarity be-
tween them throughout the entire test. In
some instances, as would be expected, there
is a slight advantage for the group with pho-
bias, and in other instances a slight advantage
for the without-phobias group.


2 Published by The American Council on Education, Wash-
ington, D. C.
3 Fisher, R. A., Statistical Methods for Research Workers.
Oliver and Boyd, London.
é Dah? : - - wT 











For the section of the Allport test dealing
with theoretical values the group with phobias
had an average score of 25.8; without pho-
bias, [...]. The test-score results on the re-
maining tests for two groups were as follows:
Economic, 24.3, 27.1; Aesthetic, 33.1, 31.8;
Social, 31.4, 33.0; Political, 28.8, 25; Relig-
ious, 37.95, 38.0. These results show a slight
advantage, 1.9, in the first test for the group
with phobias; 2.8 increase in second test, for
the without-phobia group; 1.3 increase in
third, for without-phobia group; 1.6 increase
in fourth, for without-phobia group; 3.8 in
fifth, for with-phobia group; .05, in sixth, for
without-phobia group.

The differences in general ability and in
sense of values would indicate that the sub-
jects are from one and the same population.


The second question relative to the potency
of the particular phobias and the extent to
which they affected the social adjustment in
certain situations seemed of major importance
before comparisons could be accepted. Of
course, the phobias are of varying degrees of
strength in different individuals, but all of the
fears seemed sufficiently serious to be classi-
fied as real phobias. Because of the variety
of phobias possessed by the group, it will be
impossible to give even brief case histories
of all of them. A few cases must suffice.


Feather Phobia 
The subject cannot recall the reason for
her extreme fear of feathers. She stated
that she was not born with this fear and did
not develop it in very early childhood, for
her parents can recall her having made
pets of chickens to the extent of calling
them by name and playing with them as a
child might with a doll. However, she re-
calls none of this; her only recollection is
that she was dreadfully afraid of chickens,
birds of all kinds, and all feathers. Living
or dead birds frighten her and feathers of
any kind are instruments of torture to her.
Her older brothers often made use of this
knowledge and made her do their bidding
by the mere threat of touching her with a
feather. Serious lapses in friendship have
resulted from individuals trying to scare her
with feathers.


The writer is indebted to Miss Pogoda for the case his-
tories in this study. She is believed to be thoroughly compe-
tent in the recognition of real phobias because she possesses an
extreme unnatural fear in a specialized field, and is therefore
able perhaps to recognize what should be regarded as a
phobia in others.
Her mother, at some time, forced her in-
to a room and put a pet canary up to her
cheek in an attempt to cure her of this 
fear; but she became terrified, grew pale, 
screamed, and trembled uncontrollably. 

She actually avoids touching pictures of 
birds or feathers in books and never wears 
feathers in her hat — even stiff, artificial 
ones. She understands that no physical 
harm can come to her through contact with 
a feather, but she cannot make herself 
touch one. She has tried and has succeeded 
for a moment to hold a tiny, white feather 
in her hand, but, as soon as she realized 
what she was doing, she dropped it shudder- 
ing involuntarily. 

At college, she had an unpleasant experi- 
ence with a director who insisted that she 
wear a cloak decorated with feathers in a 
play. She insisted that she could not, and 
had the feathers torn off the cloak before 
she could possibly wear it. 

No other fears, as of bugs or mice, disturb
her, but her fear of feathers is amazingly 
acute. She would rather be placed in a 
cage with lions, she says, than in a cage 
with chickens. 


Spider-and-Millers Phobia 

From earliest recollections the subject 
was exceedingly fearful of spiders and mil- 
lers. She has no recollection of the origin 
of the fear but merely remembers always 
being afraid of spiders and millers. 

One of the first incidents in her experi- 
ence with this fear was at a time when she 
was seven years old. She was spending a 
summer at a camp in which there were 
many cobwebs and spiders. Her fear was 
so great that she slept with covers pulled 
over her head and awoke in a cold sweat.

On other occasions she would refrain 
from opening windows at night (though she 
is an advocate of fresh air in great abund- 
ance) merely because a spider web was 
somewhere in the vicinity and the possibil- 
ity of millers flying in through the window 
was great. She never can summon enough 
courage to kill a spider or miller, but rather 
shrinks back in utter terror. Many un- 
pleasant situations and cruelties have re- 
sulted from this unnatural fear—such as: 
(1) The subject’s being chased by an older 
sister who pursued her with a spider held 
menacingly forth, (2) A college student 
placing spiders near her belongings or pur-
suing her with them.

This fear, she claims, is a real and vital
one, far greater than the average individ-
ual’s understandable squeamishness toward
bugs and mice.


Case histories have been prepared for each 
of the subjects, all college students, who pos- 
sessed an extreme unnatural fear which, at 
times, greatly affected her social adjustment, 
but space will not permit a presentation of
them. The phobias of the subjects were: 
falling from high places, being chased (just 
for fun), spiders, millers, bridges, sharp 
points, addressing audiences, fire, deep water, 
and feathers. 

Still another instance will show the unnat- 
ural degree of these fears. The individual 
who is extremely afraid of addressing audi- 
ences has caused no end of worry for herself 
during the past year and a half in college. In
courses requiring oral reports before the class 
she has made special arrangements with pro- 
fessors to avoid speaking before the group. 
It is believed, by some instructors, that she 
would discontinue her college course rather 
than make an oral report. This is not due to 
any lack of ability to offer a satisfactory re- 
port but due to an absolute loss of emotional 
control when facing groups during a formal 
report. One of the professors inquired of her 
what she thought would happen if she were 
compelled to make an oral report before the 
class group. She replied that she believes 
that she would lose all consciousness and 
would perhaps fall to the floor in a faint. 


RESULTS 

Table I presents the means, standard devi- 
ations, standard errors of the means, standard 
errors of the differences between the means 
(for subjects with and those without pho- 
bias), the ratios of the actual differences be- 
tween the means and the standard errors of 
the differences, and the extent of the signifi- 
cance of the actual differences as determined 
from tables of “Student’s” distribution. 
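
The quantities listed for Table I can be sketched in a few lines. This is only an illustration: the text does not state the formula used for the standard error of the mean, so the sketch assumes the standard deviation divided by the square root of N - 1, which appears to reproduce the tabled figures for N = 10. The function names are hypothetical, and the input values are those tabled for the "disgust" words in Test I.

```python
import math


def standard_error_of_mean(sd, n):
    # Assumption: sd / sqrt(n - 1); the source does not give the formula.
    return sd / math.sqrt(n - 1)


def difference_ratio(mean_1, sd_1, n_1, mean_2, sd_2, n_2):
    """Return both standard errors, the standard error of the
    difference between the means, and the ratio D / S.E. of D."""
    se_1 = standard_error_of_mean(sd_1, n_1)
    se_2 = standard_error_of_mean(sd_2, n_2)
    se_diff = math.sqrt(se_1 ** 2 + se_2 ** 2)
    ratio = (mean_1 - mean_2) / se_diff
    return se_1, se_2, se_diff, ratio


# "Disgust" words in Test I: with phobias (13.5, 1.43), without (11.6, 1.90).
se_with, se_without, se_diff, ratio = difference_ratio(
    13.5, 1.43, 10, 11.6, 1.90, 10
)
```

The resulting ratio is then referred to a table of "Student's" distribution to find the interval within which P falls.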

Test I (affectivity scores) seems to ap- 
proach absolutely significant differences in 
discriminating between the subjects with and 
without phobias. The subjects without pho- 
bias dislike fewer objects and things. These 
subjects are also less sensitive to and express 
less dislike for words pertaining to sex, fear, 
disgust, and self-feeling. Whether there are,
however, significant differences between the
subjects in all these respects is described in
the following paragraphs.

Test I (affectivity score), dealing with unpleasant
objects and topics, and the “disgust” words in
Test I have “P” values between .05 and .02 and,
therefore, the means show significant differences
between the subjects with and those without
phobias. The ratios between the actual differences
of the means and the standard errors of the
differences are identical, namely, 2.40. It is
interesting to note that these two measures are
far more significant in describing the differences
between these two groups of subjects than are the
combined affectivity scores on Tests I, III, and IV
of the Pressey test. This may point to the
necessity of increasing the repertoire of objects
and things which are satisfying to those
individuals who possess phobias. Since many phobia
cases offer rather stubborn problems before the
subjects can finally conquer their difficulties,
unpleasant attitudes, it appears, must be subdued
by pleasant ideas and a satisfying feeling-tone.
The effectiveness of Test I (affectivity score) in
revealing significant differences between the
phobia and non-phobia groups is verified,
therefore, by the “disgust” words in Test I, which
also show significant differences.

The “self-feeling” words in Test I and the
total affectivity scores on Tests I, III, and IV
do not show significant differences between
the phobia and non-phobia groups. It seems
that self-feeling does not intensify highly spe- 
cialized unnatural fears; however, this ques- 
tion should be attacked by further experi- 
mentation. 

The “fear” words in Test I, and topics 
which cause the subjects worries or make them 
nervous in Test IV, show no significant differ- 
ences between the two groups of subjects. 
The lack of demonstrative differences between 
the subjects on the “fear” words in Test I may 
be due to the fact that phobias are extremely 
specialized fears, i.e., an individual may have 
one extreme fear and yet not fear the ordi- 
nary things generally held as disturbing to 
the majority. 

“Sex” words in Test I show no significant
differences between the means of the two
groups. This is what would be expected.
Furthermore, mere disapproval of certain
subjects, Test III, does not show significant
differences. It appears that the dissatisfying,
unpleasant, disgusting situations must be
sufficiently intense before significant differences
between the phobia and non-phobia groups are
registered, as was the case in the affectivity
score in Test I and the “disgust” words in
Test I.


TABLE I

SUBJECTS WITH AND THOSE WITHOUT PHOBIAS COMPARED WITH RESPECT TO MEANS, STANDARD
DEVIATIONS, STANDARD ERRORS OF THE MEANS, STANDARD ERRORS OF THE DIFFERENCES
BETWEEN THE MEANS, THE RATIOS OF THE ACTUAL DIFFERENCES BETWEEN THE MEANS AND
STANDARD ERRORS OF THE DIFFERENCES, AND THE VALUE OF “P” ACCORDING TO
“STUDENT'S” TABLES. WITH, N = 10; WITHOUT, N = 10

Tests                                  Phobias    M      S.D.   S.E. of M  S.E. of D  D/S.E.  P between
Total affectivity scores on Tests I,   With      137.8  13.90     4.63
  III, and IV of Pressey X-O test      Without   121.6  20.97     6.99       8.38      1.93   .1 & .05
I: unpleasant objects and topics       With       33.2   7.37     2.46
                                       Without    26.6   3.72     1.24       2.75      2.40   .05 & .02
III: subjects check anything they      With       61.8   8.06     2.69
  disapprove                           Without    59.1  15.68     5.23       5.88      .459   .7 & .6
IV: topics which cause subjects        With       42.8   6.89     2.30
  worries or make them nervous         Without    35.9  12.26     4.09       4.69      1.47   .2 & .1
Sex words in Test I                    With        6.9   3.73     1.24
                                       Without     6.3   2.05      .68       1.41      .426   .7 & .6
Fear words in Test I                   With        6.2   3.37     1.12
                                       Without     4.2   1.47      .49       1.22      1.64   .2 & .1
Disgust words in Test I                With       13.5   1.43      .48
                                       Without    11.6   1.90      .63        .79      2.40   .05 & .02
Self-feeling words in Test I           With        6.7   3.95     1.32
                                       Without     3.7   2.41      .80       1.54      1.95   .1 & .05


CONCLUSIONS

1. A large affectivity score on Test I of the
Pressey X-O test, and a large number of “disgust”
words checked by the subjects in Test I, show a
significant difference between those
persons with and those without phobias. It
may suggest that psychiatrists may be required
to reduce the number of situations,
generally, which are dissatisfying, disgusting,
to those with phobias before a permanent
cure can be made. This approach may warrant
special experimentation.

2. To set to work consciously to build up a 
large repertoire of situations which are satis- 
fying may be effective in crowding out a 
highly specialized fear which has reached the 
phobia stage. 

3. There was no significant difference be- 
tween the phobia and non-phobia groups with 
regard to the number of “self-feeling”, “fear”,
or “sex” words which were marked by the 
subjects. In each of these instances the with- 
phobia group marked a larger number of 
words, but the increase was not significantly 
large. 








A STUDY IN THE PREDICTION OF COLLEGE 
FRESHMAN MARKS*
Enid, Oklahoma


I. INTRODUCTION 


Colleges have long been interested in the 
prediction of student scholarship, especially 
the minimum degree of scholarship required 
by the institution for graduation. This atti- 
tude is reflected in many kinds of require- 
ments, nearly all of which have had as their 
primary aim the forecasting of the ability of 
the student to succeed in the institution, so 
far as academic achievement is concerned. 

It had been assumed, until recent years, 
that evidence of success in the secondary 
school was sufficient indication of ability to 
succeed in college or university. This was a 
more logical attitude to take in times past, 
when Greek, Latin, and religion constituted 
the major offerings in both secondary school 
and college. However, as curricular offerings 
expanded and diversified in both secondary 
and higher institutions of learning, it was ob- 
served that many who were admitted to col- 
lege failed, that previous education was often 
inadequate, and that the correlation between 
success in secondary education and in higher 
education was low. These observations, 
coupled with the fact that either law or pub- 
lic opinion forces public institutions of higher 
learning to admit all high school graduates, 
have led to the use of many measures, the 
purpose of which is to evaluate the ability of 
the freshman after he has been admitted. 

Recent trends, in general, have been from 
traditional and subjective criteria of freshman 
ability to experimental and objective meas- 
ures. Among the measures used are: high 
school marks, mental tests, aptitude tests, 
achievement tests, character tests, and the pat- 
tern of high school subjects. The first two 
mentioned are by far the most frequently 
used of the measures listed. 

The results, on the whole, have been disap- 
pointing, largely because they were true of the 
group as a whole but did not apply to indi-
viduals or to types of students, with any de-
termined degree of definiteness. The findings
did not reveal what was happening to the boy,
as distinguished from the girl, or to the
bright student as compared with the dull.

Studies have been limited, in nearly all
cases, to one or two predictive factors in a
given institution. Procedure has consisted,
for the most part, in running a simple corre-
lation between the predictive factor and col-
lege marks, the authors failing to validate
their predictions by concrete application to
the student body.

Further development of the problem will 
doubtless be influenced by a number of fac- 
tors, including the philosophy of education on 
the college and university levels, further 
growth of enrollments in colleges and univer- 
sities, with accompanying developments in 
personnel work, conditions and differentia- 
tions of employment among the educated 
classes, further developments and refinements 
in the science of testing, including character 
traits, and the development of more reliable 
means of determining high school and college 
marks. 

Because of curricula varying both in con- 
tent and difficulty in different institutions of 
higher education, varying medians of ability 
in freshman classes in different colleges and 
universities, varying methods of teaching and 
standards of achievement in different institu- 
tions of higher learning, wide differences in 
faculty, buildings, and equipment of different 
institutions, and many other variables too 
numerous to mention here, no uniform form- 
ulae or criteria can be prescribed for the eval- 
uation of ability of freshmen to succeed in all 
colleges and universities. Each institution 
must, in a measure, settle the problem for it- 
self. Probably the best scientific attack on 
the problem of prediction is to select an insti- 
tution of higher learning as nearly typical of 
a general class as possible and study inten- 
sively a number of predictive factors in that 
institution as a basis for the evaluation of 
the ability of entering freshmen to succeed in
all similar institutions. That is the plan of
this study, and Phillips University is the in- 
stitution chosen for the investigation. 

The agents used for prediction in the study 
are: (a) The American Council Test, as a 
measure of intelligence; (b) The Ohio State 
University Psychological Examination, as a 
measure of intelligence; (c) high school 
marks, as a complex measure, including intel-
ligence, academic achievement, character
traits, and perhaps other factors; (d) first
semester college marks, used both as the thing
predicted by the other agents, (a), (b), (c),
and (d) and as itself an agent for the predic-
tion of second semester marks; and (e) the
Purdue Placement Test in English, as a check
on the value of special aptitude tests in pre-
dicting marks in a special field.

No attempt is made to justify, as predictive
agents, any of the measures used, or to in-
crease the validity or reliability of any of
them. The purpose of the study is to deter- 
mine, as far as the techniques employed are 
capable of determining, the amount of pre- 
dictive value in forecasting college marks that 
these measures have, individually and collec- 
tively, theoretically and practically, without 
reference to the question as to whether they 
should have more of such value, and also 
without reference to the means of increasing 
said value. 

The investigation was made in Phillips Uni- 
versity, because in all its departments it is 
predominantly a liberal arts school, the type 
in which most of the measures used have been 
validated; it has a cosmopolitan student 
body; and the data for the study were read- 
ily accessible to the author. 

The sources of data used in the study are:
(1) the scores resulting from giving the Ohio
State University Psychological Examination,
Form 18, to freshmen at Phillips University in
December, 1934; (2) the scores resulting from
giving the American Council Test to Phillips
University freshmen, in September, 1934;
(3) scores resulting from giving the Purdue
Placement Test in English to freshmen at
Phillips University, in September, 1934; (4)
the transcript of high school marks for each
freshman who had Ohio State University
Psychological Examination scores, American
Council Test scores, and Purdue Placement
Test in English scores on file in the registrar’s
office for the year 1934-1935; and (5) marks
received during the first and second semesters
of 1934-1935 in Phillips University, by each
student in the study. One hundred forty
freshmen, seventy-five boys and sixty-five
girls, were found to have complete records in
these five measures.

From these data eighty-six simple correla- 
tions were calculated as a basis of direct com- 
parisons, as well as multiple correlations and 
regression equations needed in the study. 
These interrelationships are shown in Table I. 

The value of the measures used in predict- 
ing marks in specific subjects or subject-fields 
is sought by the use of the differential corre- 
lation formula (recently developed and vali- 
dated by Segel) , comparing results with actual 
marks obtained by the students. The Criti- 
cal Point, that is, the mark below which a 
student may not fall and still be likely to suc- 
ceed, academically, at Phillips University, is 
determined. 


II. LIMITATIONS AND RELIABILITY OF DATA 


College marks are used as the basis of many
decisions of great importance to the student.
They are used to forecast his further success
as a college student; to determine his fitness 
to engage in certain occupations, his eligibility 
to honor societies and athletic activities, his 
merits in contests for prizes and scholarships, 
his qualifications to pursue specific courses of 
study; and to serve as a basis on which to 
predict his general success in life. Since the 
student’s destiny so largely depends upon his 
college marks, and they in turn are largely 
determined by the total situation at a given 
school, factors influencing marks at that 
school should be carefully studied as a basis 
for predicting, before or soon after the student 
enters the school, as nearly as possible what 
the student’s marks will be. With this in 
mind the situation at Phillips University is 
presented in the following paragraphs to serve 
as a background for this study. 

Phillips University uses a typical five-point 
marking system as follows: S (superior) is 
given to about the upper ten per cent of the 
class; G (good) is given to the next lower 
twenty-five per cent; M (medium) is given to 
the next thirty-five per cent; I (inferior) is 
given to the lowest fifteen per cent of the class 
passing; U (unfinished) satisfactory work, or 
C (conditioned) unsatisfactory work, or F 
(failure) is given to those not passing the 
course. A mark of S earns for the student
three honor or credit points for each semester
hour of credit received in the course; a G
earns two; M, 1; I, 0; U, C, and F, -1 each.
The number of credit points earned divided 
by the number of semester hours of enroll- 
ment equals the scholarship quotient of the 
student. A scholarship quotient of 1 is re- 
quired for graduation, and a student with a 
smaller quotient is on probation until such
time as his quotient equals or exceeds 1, or un-
til the student withdraws from school.
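
A minimal sketch of the scholarship quotient as described above may make the arithmetic concrete. The function name, the input format, and the sample record are illustrative assumptions and do not come from the university's records; U, C, and F are taken as -1 credit point each.

```python
# Credit points per semester hour for each mark
# (U, C, and F taken as -1 each).
CREDIT_POINTS = {"S": 3, "G": 2, "M": 1, "I": 0, "U": -1, "C": -1, "F": -1}


def scholarship_quotient(courses):
    """Credit points earned divided by semester hours of enrollment.

    `courses` is a list of (mark, semester_hours) pairs; this input
    format is an assumption made for the illustration.
    """
    total_hours = sum(hours for _, hours in courses)
    if total_hours == 0:
        raise ValueError("no semester hours of enrollment")
    total_points = sum(CREDIT_POINTS[mark] * hours for mark, hours in courses)
    return total_points / total_hours


# Hypothetical student carrying 15 semester hours.
record = [("S", 3), ("G", 3), ("M", 4), ("M", 3), ("F", 2)]
quotient = scholarship_quotient(record)
on_probation = quotient < 1  # a quotient of 1 is required for graduation
```

In this hypothetical record the failure costs two credit points, but the quotient still stands above the required value of 1.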

For the purposes of this study the literal 
symbols of the marking system were quanti-
fied by assigning a value of 5 to S, 4 to G,
3 to M, 2 to I, and 1 to U, C, or F. This
was done to avoid zero and negative quanti- 
ties in the statistical treatment of the data. 
Literal marks and their equivalents on high 
school transcripts were quantified in the same 
way. 

Fields and subjects in which college marks 
were used in this study are English (compo- 
sition and rhetoric); mathematics (algebra, 
trigonometry, solid geometry, and analytical 
geometry); science (physics, chemistry, biol- 
ogy and geology); foreign language (French, 
Spanish, German, Latin and Greek); and his- 
tory and social science (American history and 
government, European history, and psychol- 
ogy). Courses in these subjects are, for the 
most part, continuous through the freshman 
year, being, in fact, year subjects rather than 
semester subjects. In 1934-1935 ninety-six 
per cent of the students in English I who re- 
mained through the year took English II, and 
ninety-one per cent of these had the same 
teacher they had the first semester. Seventy- 
six per cent of the students in mathematics 
(algebra-trigonometry) the first semester took
algebra-analytics, or algebra-solid geometry 
the second semester and eighty-five per cent 
of these had the same teacher the second 
semester. Eighty-seven per cent of freshmen 
students in foreign language the first semes- 
ter continued the same subject the second 
semester with ninety-two per cent having the 
same teacher. Ninety-four per cent of fresh- 
men in science continued the same subject the 
second semester with ninety-five per cent hav- 
ing the same teacher. Fifty-nine per cent of 
freshmen in social science continued the same 
subject the second semester, and eighty-four 
per cent made no change in teachers. 

The subjects used in the study were fresh- 
men at Phillips University for the year 1934— 
1935. One hundred forty students were 
found with complete records in the measures
used in the study. More could have been
included if fewer measures had been employed.
It was thought, however, that an intensive
study of a smaller number would be prefer-
able to a superficial investigation of a larger
group. The size of the group, beyond the 
number required to give a fairly smooth curve 
of distribution, is important only to the extent 
that it reduces the sampling errors. The de- 
gree of influence that larger numbers might 
have in an investigation, granting that the 
group used is representative, may be illus- 
trated by reference to the probable error of 
the mean of scores made on the Ohio State 
University Psychological Test by the group 
used in this study. The mean is 72.84, and 
the probable error of the mean is 1.18. Now 
in order to cut this error in half (to .59) by 
increasing the size of the group, N (140)
must be multiplied by 4, that is, 560 cases
must be included in the study (4, p. 125).
If we wish to cut the error to one-eighth of its 
size (to .15), N (140) must be multiplied by 
64, that is, 8,960 cases must be included in 
the study. 
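
The arithmetic above follows from the probable error of the mean varying inversely as the square root of N. A brief sketch, with hypothetical function names, reproduces the figures quoted:

```python
import math


def cases_needed(n_current, reduction_factor):
    """Cases required to divide the probable error of the mean by
    `reduction_factor`, assuming the error varies as 1 / sqrt(N)."""
    return n_current * reduction_factor ** 2


def scaled_probable_error(pe_current, n_current, n_new):
    """Probable error of the mean after changing the group size."""
    return pe_current * math.sqrt(n_current / n_new)


n = 140      # freshmen in the study
pe = 1.18    # probable error of the mean reported in the text

n_half = cases_needed(n, 2)
n_eighth = cases_needed(n, 8)
pe_half = scaled_probable_error(pe, n, n_half)
pe_eighth = scaled_probable_error(pe, n, n_eighth)
```

The square in `cases_needed` is the reason large gains in precision are so costly: each halving of the error quadruples the number of cases.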

The normalcy of the curve of distribution
of the Ohio State University Psychological
Examination scores received by the students
of this study, as shown by the superposition
of the normal curve upon it, provides evidence
that the group used is adequately representa-
tive of the larger group of college freshmen.
A technique developed by Dickey (2, p. 439)
was used to reveal this relationship. G (nor-
malcy) is found to be .90 ± .0171. The G
of 1255 University of Oklahoma freshmen on
the same test given at the same time is
.90 ± .008.

The writer is aware that the reliability of 
college marks is a crucial factor in the final 
solution of the problem under consideration. 
The reliability of semester marks at Phillips 
University is unknown, and at present, in- 
capable of being determined with any high 
degree of accuracy. This is so, chiefly, be- 
cause the teachers have no common basis on 
which to assign the marks. They also have 
few objective standards for the evaluation of 
the marks assigned. An inquiry was made 
among the teachers of Phillips University to 
determine the factors considered by them in 
assigning semester marks. It revealed the 
fact that many other things besides academic 
achievement are considered in making up the 
students’ marks. Thirteen different factors
were used. Three teachers considered all these
factors, and one considered only three, eleven
being the median number employed. There
were no two teachers who used the same set
of factors with anything like the same degree
of emphasis on corresponding factors, nor was
there agreement among the faculty on the rel-
ative importance of any single factor. The
situation is quite chaotic.

A study of teachers’ marks based upon sec-
ond semester examinations at Phillips Univer-
sity for 1934-35 showed a reliability coeffi-
cient of .073. This coefficient was
used for validation of results obtained in the
study.

On the basis of data given by Ruch (9, p. 
107) and (10, p. 55) and Symonds (14, p. 
sor), a reliability coefficient estimated at .60
validation. Published reliability coefficients
for the other measures used in the study are: 
Ohio State University Psychological Examina- 
tion .92 + .003, (7, p. 2048.1); American 
Council Test .95 (13, p. 135); Purdue Place- 
ment Test in English .95 (8, p. 3). 

The much higher reliability of the stand-
ardized tests as compared to high school and
college marks as presented here is especially
noticeable. This situation will probably con-
tinue to be so until there is better agreement
among teachers as to what should determine
course marks, and until there is a more uni-
form standard for evaluating the factors in-
cluded in marks. The importance of this
wide difference in the reliability of the meas- 
ures used in this study will become manifest 
in a later section of the investigation.


III. STATISTICAL ANALYSIS OF
RELATIONSHIPS


Intercorrelations among measures used in 
the study are shown on Table I by the numer- 
als one to sixteen along the rows, and two to 
sixteen along the columns, these figures being 
employed to identify measures used in regres- 
sion equations. The first number in each cell 
in the body of the table is r, the middle one is 
the probable error of r, and the lower one is r 
corrected for attenuation. 
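
The correction for attenuation can be sketched as follows. The text does not print the formula, so the sketch assumes the conventional one, in which the observed correlation is divided by the square root of the product of the two reliability coefficients. The function name and the numerical values are illustrative assumptions and are not taken from Table I.

```python
import math


def correct_for_attenuation(r_xy, reliability_x, reliability_y):
    """Observed correlation divided by the geometric mean of the two
    reliability coefficients (the conventional correction, assumed here)."""
    if not (0 < reliability_x <= 1 and 0 < reliability_y <= 1):
        raise ValueError("reliabilities must lie in the interval (0, 1]")
    return r_xy / math.sqrt(reliability_x * reliability_y)


# Illustrative values only: an observed r of .50 between a test with
# reliability .90 and marks with reliability .60.
observed_r = 0.50
corrected_r = correct_for_attenuation(observed_r, 0.90, 0.60)
```

Because the divisor is never greater than 1, the corrected value is never smaller than the observed one, and it can exceed 1.00 when the reliability estimates are low.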

A brief study of Table I reveals the fact 
that of the first three measures, which are 
used as predictive agents only, the Ohio State 
University Psychological Examination scores 
yield the highest correlation with the first 
semester average of college freshman marks
(.522), high school average marks the second
(.516) and American Council Test scores the 
third (.408). The difference between .522 
and .408 is not statistically significant, but it 
is enough to indicate some probability (ninety 
chances in a hundred) that a true difference 
exists in the direction here indicated. The 
difference (.114) is .475 as large as it should 
be to be entirely significant. The average of 
these correlations is .482. When correlated 
with second semester average of college marks 
(column five) these three measures rank, high 
school average first (.597), Ohio State Univer- 
sity Psychological Examination scores second 
(.557) and American Council Test scores 
third again (.412), the average of the three 
correlations being .522. The correlation of 
the same three measures with first semester 
college freshman marks in five specific subject 
fields shows that the Ohio State University 
Psychological Examination scores rank first 
with an average r of .473; high school aver- 
age marks rank second with .385; and Amer- 
ican Council Test scores rank third with an r 
of .367, the average of the three being .410. 
The rank order of these three measures is the 
same when they are correlated with second 
semester college marks in the same five sub- 
ject fields. 
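
The comparison of .522 with .408 can be retraced in a short sketch. Several things here are assumptions rather than statements of the author's procedure: the probable errors of .041 and .044 are those reported for these two correlations, the probable error of the difference is taken as the root of the sum of squares, a difference of four times its probable error is taken as the standard of complete significance, and the chances of a true difference are read from the normal curve. The function names are hypothetical.

```python
import math


def probable_error_of_difference(pe_1, pe_2):
    """Probable error of a difference, treating the two as independent."""
    return math.sqrt(pe_1 ** 2 + pe_2 ** 2)


def chance_of_true_difference(difference, pe_diff):
    """Normal-curve probability that the true difference exceeds zero."""
    sigma_diff = pe_diff / 0.6745  # a probable error is .6745 sigma
    z = difference / sigma_diff
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


r_ohio, pe_ohio = 0.522, 0.041
r_council, pe_council = 0.408, 0.044

difference = r_ohio - r_council
pe_diff = probable_error_of_difference(pe_ohio, pe_council)
ratio = difference / pe_diff
fraction_of_required = ratio / 4.0  # assumed standard: D = 4 x P.E.
chance = chance_of_true_difference(difference, pe_diff)
```

Under these assumptions the difference is a little under half of what the assumed standard requires, and the chances of a true difference come out near ninety in a hundred, in close agreement with the figures in the text.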

The average correlations of these three 
measures with each of the subject fields in- 
cluded in the study are as follows: First 
semester, mathematics .553, English .440, so- 
cial science .362, science .353, and foreign lan- 
guage .346; second semester, mathematics 
.764, English .452, social science .437, science 
.331 and foreign language .371. It is thus 
seen that the average correlations of these 
measures with marks in the specific subject 
fields have the same rank order the second 
semester as the first, except that science and 
foreign language shift ranks at the lower end 
of the series. These findings would seem to 
indicate that there is a real difference in the 
ability of the Ohio State University Psycho- 
logical Examination, high school average 
marks and American Council Test to predict 
marks in the different college freshman fields 
included in this study, when these correla- 
tions are averaged. 

Correlation of first semester college fresh- 
man marks with second semester college fresh- 
man marks (.784) is much higher than any 
of the other correlations in Table I involving 
general scholarship. This is probably due 
chiefly to the fact that most freshmen at Phil-
lips University, as is probably true also in
most small typical liberal arts colleges, con-
tinue their first semester courses through the
second semester without change of teachers
and with little change in class personnel.

This situation makes the first semester aver-
age of freshman marks at Phillips University
a valuable agent for the prediction of second
semester marks. However, in spite of its ad-
vantage in respect to general scholarship, it
is but little superior to Ohio State University
Psychological Examination or high school av-
erage marks for the prediction of second
semester marks in the separate college subject
fields included in this study, as can be ob-
served in Table I, rows one, three and four,
columns eleven to fifteen inclusive.

The average correlation of high school
marks in subject fields with college marks in
the corresponding fields for the first semester
(.482; average of the first column in Table II)
is the same as that of mental tests and high
school marks with average college freshman
marks for the first semester (.482). This
might seem to lend some weight to the re-
cently proposed and rapidly developing hy-
pothesis that given amounts of academic prep-
aration in specified high school subject fields,
as a prerequisite to college entrance, are not
essential to college success (Douglas).
Caution against such conclusion, how-
ever, is presented in the fact that the correla-
tion of high school marks in the separate sub-
ject fields with second semester college fresh-
man marks in the same subject fields (.590;
average of the fourth column in Table II)
is nearly .07 higher than is the average of
mental tests and high school average with
second semester average of college marks,
which is .522, as indicated above. This fact,
unless accounted for in some way not now ap-
parent to the writer, would tend to prolong
the traditional belief in the efficacy of high
school preparation in specified subject fields
to enable the student to achieve college suc-
cess in those fields.

From the correlations calculated by a 
varied combination of the measures used in 
the study, the following were selected as prob- 
ably the most efficient for presenting the dif- 
ferent phases of prediction attempted in gen- 
eral scholarship: 

1. The multiple correlation of first semester
college marks with American Council Test
scores and high school average marks is .590,
with a probable error of estimate of ± .445;
and the regression equation is: first semester
mark = .004 × American Council score + .566 ×
high school average + [constant illegible].

2. The multiple correlation of first semes-
ter college marks with Ohio State University
Psychological Examination scores and high
school average marks is .605, with a probable
error of estimate of ± .436; and the regres-
sion equation is: first semester mark = .014 ×
Ohio State score + .500 × high school average
+ .249.

3. The multiple correlation of first semes-
ter college marks with Ohio State University
Psychological Examination scores and Ameri-
can Council Test scores and high school aver-
age marks and English Aptitude Test scores
is .621, with a probable error of estimate of
± .425; and the regression equation is: first
semester mark = .01 × Ohio State score - .0003 ×
American Council score + .428 × high school
average + .0005 × English score, with a constant
term of .040 [sign illegible].

4. The multiple correlation of second se-
mester college marks with first semester col-
lege marks and high school average marks is
.829, with a probable error of estimate of
± .295; and the regression equation is: second
semester mark = .590 × first semester mark +
.445 × high school average - .270.

5. The multiple correlation of second se-
mester college marks with first semester col-
lege marks and Ohio State University Psy-
chological Examination scores is .802, with a
probable error of estimate of ± .316; and the
regression equation is: second semester mark =
.671 × first semester mark + .008 × Ohio State
score + [constant illegible].

6. The multiple correlation of second se-
mester college marks with first semester col-
lege marks and high school average marks
and Ohio State University Psychological Ex-
amination scores and English Aptitude Test
scores is .836, with a probable error of esti-
mate of ± .292; and the regression equation is:
second semester mark = .516 × first semester
mark + .418 × high school average + .003 × Ohio
State score + [illegible]5 × English score - .235.

7. The correlation of first semester college
marks with Ohio State University Psychologi-
cal Examination scores is .522, with a prob-
able error of ± .041 and a probable error of
estimate of ± .319; and the regression equa-
tion is: first semester mark = .021 × Ohio State
score + 1.634.

8. The correlation of first semester college
marks with American Council Test scores is
.408, with a probable error of ± .044, and a
probable error of estimate of ± .333; and
the regression equation is: first semester mark
= .007 × American Council score + 2.031.

9. The correlation of first semester college
marks with high school average marks is .516,
with a probable error of ± .042, and a prob-
able error of estimate of ± .362; and the re-
gression equation is: first semester mark =
.697 × high school average + .451.

10. Correlation of second semester college
marks with first semester college marks is
.784, with a probable error of ± .021, and a
probable error of estimate of ± .219; and the
regression equation is: second semester mark =
.765 × first semester mark + .774.


TABLE II

SUBJECT-WITH-SUBJECT CORRELATIONS OF HIGH SCHOOL MARKS WITH COLLEGE FRESHMAN
MARKS, WITH PROBABLE ERRORS AND CORRECTIONS FOR ATTENUATION

                                    (First Semester)                  (Second Semester)
                               Corre-    Probable  Corrected     Corre-    Probable  Corrected
Subject                        lation    error     correlation   lation    error     correlation
English                         .541      .033       .703         .551      .031       .735
Mathematics                     .439      .096       .954         .804      .044      1.340
Science                     [illegible]   .041       .720         .296      .061       .406
Foreign language                .477      .051       .636         .637      .046       .671
History and social science  [illegible]   .067       .786         .664      .048      1.210


First semester marks were predicted by the
regression equations in 1, 2, 3, 7, 8, and 9,
and second semester marks by the regression
equations in 4, 5, 6, and 10.

The regression equation was used to predict
the average of semester college marks for each
student. These marks were correlated with
achieved marks, and the coefficient compared
with the correlation coefficient in the regres-
sion situation by which the marks were pre-
dicted. Means and sigmas of predicted marks
were calculated and compared with those of
achieved marks. The predicted mark of each
student was compared with his achieved mark
and the difference found. The sum of these
was divided by N to get the average difference
between predicted marks and achieved marks.
The range of difference was found by noting
the lowest difference, as well as the highest,
between the predicted mark and the achieved
mark of any student in the study. All these
calculations were made for each of the re-
gression equations in situations 1 to 10, in-
clusive. The results of these operations are
shown in Table III.


TABLE III

REGRESSION SITUATIONS, MEANS AND SIGMAS, AVERAGE DEVIATION AND RANGE OF DIFFERENCE,
AND CORRELATIONS OF PREDICTED AND ACHIEVED MARKS, ALSO THE CORRELATIONS
OF THE REGRESSION SITUATIONS

Regression                     Diff. in     Range of    r (predicted
Situation*   Mean†   Sigma†    Grade Pts.   Diff.       & Ach. Marks)   R
  1          3.19    .81
             3.15    .49       .492         0-1.54      .593 ± .035     .590
  2          3.19    .81
             3.18    .53       .480         0-2.41      .613 ± .034     .605
  3          3.19    .81
             3.19    .60       .437         0-1.49      .618 ± .034     .621
  4          3.33    .79
             3.37    .52       .341         0-1.76      .834 ± .021     .829
  5          3.33    .79
             3.32    .67       .320         0-2.15      .821 ± .022     .802
  6          3.33    .79
             3.37    .66       .314         0-1.80      .842 ± .020     .836
  7          3.19    .81
             3.17    .63       .520         0-2.67      .538 ± .040     .522
  8          3.19    .81
             3.18    .61       .555         0-2.82      .413 ± .044     .408
  9          3.19    .81
             3.14    .65       .513         0-2.12      .508 ± .042     .516
 10          3.33    [.79]
             3.29    .64       .335         0-2.01      .791 ± .023     .784

* These regression situations are described in numbered paragraphs in the text.

† Figures above refer to achieved marks, and those below refer to predicted marks, in
columns of means and sigmas.
A brief examination of Table III reveals
the fact that the means of predicted and
achieved marks are approximately equal, the
largest variation between the two being .050
of a grade point, which occurs in the case of
regression situation (9). Sigmas of predicted
marks are uniformly smaller than those of
achieved marks, the greatest variation being
.320 of a grade point, which is in the case of
regression equation (1). The smallest dif-
ference in sigmas is seen to be .120 of a grade
point, in the case of regression equation (5).
Predicted marks were usually lower than
achieved marks in the upper parts of the dis-
tribution, whereas in the lower parts the re-
verse was true. This, of course, accounts for
the fact that sigma of predicted marks was
smaller than that of achieved marks.

The average difference between predicted
marks and achieved marks ranged from .314
of a grade point (about one-third of a grade-
symbol) in the case of correlation (6) to .555
of a grade point (a little over half of a grade-
symbol) in the case of regression equation
(8). The range of difference was greatest in
equation (7), being 2.67 grade points, and
least in equation (3) with 1.49 grade points.
The correlation between predicted marks and
achieved marks was, in each case, approxi-
mately the same as the correlation in the re-
gression situation used as the basis of the
predicted marks. For example, the correla-
tion in (1) is .590 as compared with .593 for
the correlation of predicted marks with
achieved marks.

It may be noted that prediction of first 
semester marks by the use of regression equa- 
tion (3) yields a smaller average difference 
from achieved marks (.437 of a grade point) 
than any other correlation used to predict first 
semester marks. When second semester 
achievement marks are placed alongside first 
semester achievement marks, and each mark 
predicted by (3) is compared to the corre- 
sponding semester mark (achieved) that is 
nearest in size to it, the average error of pre- 
diction is cut from .437 of a grade point to 
.356 of a grade point; and when the ten per 
cent most widely variant cases are eliminated, 
this error is cut to .261 of a grade point for 
the remaining ninety per cent of the students 
in the study, with a range of 0 to .96 of a
grade point. 
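The effect of setting aside the ten per cent most widely variant cases can be illustrated as follows. This is a minimal sketch, assuming that "most widely variant" means the largest absolute differences between predicted and achieved marks; the function name and all marks are hypothetical.

```python
def trimmed_average_error(predicted, achieved, drop_fraction=0.10):
    """Average and range of |predicted - achieved| after dropping the largest errors."""
    errors = sorted(abs(p - a) for p, a in zip(predicted, achieved))
    n_drop = round(len(errors) * drop_fraction)   # number of most variant cases removed
    kept = errors[: len(errors) - n_drop]
    return sum(kept) / len(kept), (kept[0], kept[-1])

# Ten hypothetical students, all predicted at 3.0, for illustration only
predicted = [3.0] * 10
achieved = [3.1, 2.8, 3.3, 2.6, 3.5, 3.0, 2.9, 3.2, 4.5, 3.4]

full_mean, full_range = trimmed_average_error(predicted, achieved, drop_fraction=0.0)
trimmed_mean, trimmed_range = trimmed_average_error(predicted, achieved)
```

Here a single widely variant student (achieved 4.5 against a predicted 3.0) accounts for much of the average error, so removing that one case lowers the average from .37 to about .24 of a grade point.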

When second semester marks are predicted 
from first semester marks, or a multiple corre- 
lation including these marks, the average dif-
ference between predicted marks and achieved
marks is much smaller than it is in the other 
predictions. This is shown in Table III, re- 
gressions 4, 5, 6, and 10. The most efficient 
of these regressions is (6), and it is a five- 
variable equation, just as (3), a five-variable 
correlation, is the most effective combination 
of the measures used in the study for the 
prediction of first semester marks. When 
the ten per cent most widely variant cases 
are eliminated the error of prediction with 
regression equation (6) is cut to .217 of a 
grade point for the remaining ninety per cent 
of the students, with a range of 0 to .60 of a
grade point. 

The findings in Table III are true when ap- 
plied to the 1934-1935 Phillips University 
Freshman Class as a whole, but they give no 
hint as to what is happening in the different 
levels of the distribution. 

In order to show how efficiently the marks 
of students of varying abilities can be pre- 
dicted by the use of regression equations (1) 
to (10) a series of ten tables showing the 
decile rank of predicted marks as compared 
with achieved marks was prepared. Tables 
IV and V, with accompanying analyses, are 
here presented as an illustration of how the 
data were treated in these ten tables. Tables 
VI and VII are a summary of findings in the 
ten tables. 

Table IV, presenting predictive results ob- 
tained by use of regression (3) compared with 
achieved marks, is presented to illustrate the 
efficiency of this technique in predicting first 
semester college freshman marks. Table V 
deals similarly with second semester marks. 
Regression equation (3) is chosen for analysis
because it is the result of the best combina-
tion of variables discovered in the study for
the prediction of first semester marks.

Both predicted and achieved marks were 
ranked individually from highest to lowest. 
These marks were then blocked into deciles 
and placed in the table. The individual stu- 
dents are represented by numerals 1 to 140. 
This was done in order to identify any given 
student in the distribution of achieved marks 
as compared with predicted marks. It also 
enables one to study the relative predictive 
efficiency of the various combinations of 
measures used, as applied to the good stu-
dent or the poor, to male or female (the un-
derscored numbers are girls), as well as to
give a check-up on the relative predictive effi- 
ciency of a given measure or group of meas- 
TABLE IV

DECILE RANKS OF PREDICTED FIRST SEMESTER MARKS OF 140 PHILLIPS UNIVERSITY FRESHMEN,
1934-1935, BASED UPON THE CORRELATION R (FIRST SEMESTER MARKS) (OHIO STATE
UNIVERSITY PSYCHOLOGICAL TEST SCORES + HIGH SCHOOL AVERAGE + AMERICAN COUNCIL
TEST SCORES + ENGLISH APTITUDE TEST SCORES) = .621, COMPARED WITH CORRESPONDING
RANKS ACHIEVED

Deciles     Predicted Rank
5 00 1, 5, 20, 24, 26, 45, 136, 139
131 ‘ 137, 27, 58, 68, 87, 88, 94"
1.30 7, 2, 19, 52, 59, 65, 114
3. 116, i, £28 208, 288
39 13, 25, 60, , 140, 129
7. 1, 32, 51, 57, 9, 70, 105
3 56 14, 80, 99, 106, 96, 126, 134
2 . 6, 30. 54, 55, 117, 110, 75
7 1, 21, 64, 74, 101, 124, 127
6
22, 44, 28, 61, 77, 86
12 4, 39, 92, 85, 102, 83, 15, 18
294 23, 48, 120, 66, 72, 104, 109
Q 29, 108, 112, 121, 130, 67, 43
2.79 eee
. 35, 36, 41, 46,  Raey Lae
2.50 17, 50, 63, 76 fA 8, 111
19 49, 93, 95, 107, 128, 84, 133
217. 38, 69, 91, 118, 132, 138, 42
2.16 3, 8, 10, 12, 40, 47, 56, 71
1.00 _ _-. 100, 125, 89, 103, 128, 82

Achieved Rank
7, 19, 20, 45, 85, 114, 116 


136, 137, 5, 6, 68, 87, 88, 139 


; , , 7 
80, 99, 128, 129, 90, 105, 119 


9, 32, 55, 58, 70, 75, 86 


1, 4, 13, 14, 52, 62, 124, 127 


? 7 ‘4 ‘ 
15, 27, 30, 54, 115, 131 


, ' 1 6 8 8 
2, 25, 26, 59, 67, 74, 83, 113 


22, 33, 51, 84, 89, 91, 94, 109, 117 


24, 40, 53, 64, 106, 107, 110 
18, 31, 44, 57, 104, 120, 135 
weerrten 
23, 37, 38, 63, 77, 78 
és 92, 96, 101, 108, 121 
130, 140, 43, 66, 69, 81, 138 
21, 29, 34, 39, 41, 100, 122 
134, 16, 28, 61, 98, 111, 118 
ener 


50, 72, 97, 103, 123, 132 


rane 


112, 133, 48, 73, 76 


Superscript shows rank of student in other distribution.


ures, in the different deciles of the distribution. The numeral just
above the number representing each student in the distribution of
predicted marks shows in what decile that student may be found in the
distribution of achieved marks. For example, student number 4, in the
upper left corner of decile ten of predicted marks is found in decile
eight of achieved marks. Students 5 and 20 in decile ten of predicted
marks are also in decile ten of achieved marks, and therefore carry no
numbers indicating a shift to a different decile in the other
distribution.

A brief inspection of these superscript numbers reveals the fact that
only one student predicted to be in the tenth decile, that is, between
5.00 and 4.31 grade points, fell below the sixth decile, or 3.37 to
3.13 grade points.
Of the fifty-six students predicted to be in deciles seven to ten of
the distribution, which just about coincides with the official range of
S (A) and G (B) marks at Phillips University, forty-two are found to
have achieved rank in those four deciles. Prediction, therefore, is
about seventy-six per cent correct in this section of the distribution.
Of the forty-two students predicted to be in deciles four, five and
six, the area of the distribution roughly coincident with the mark
M (C) at Phillips University, only sixteen are found to be in those
deciles in the achievement distribution. The prediction is only about
thirty-eight per cent correct in this section of the distribution.
Regression (3) evidently has low practical predictive value in this
area of the distribution. Of the forty-two students predicted to be in
deciles one, two, and three, which coincide with the area of the
distribution to which the marks of I (D) and F are assigned at Phillips
University, twenty-four are found to be in those deciles. Prediction in
this section of the distribution is about fifty-seven per cent correct.
This is better than in the central portion of the distribution, but not
so good as in the upper portion.

Another grouping of deciles in the distribution is interesting, as well
as useful, in the analysis of the predictive efficiency of the
different correlations used in the study. Phillips University requires
an average mark of M or 1.00 (transformed to 3.00 for the purposes of
this investigation) for graduation. This may be termed the success mark
in the school. It may be observed that this mark is in the fifth decile
of the distribution of first semester marks on Table IV. Of the
fifty-six students predicted to fall below this mark, that is, in
deciles one to four, thirty-nine of them are found to do so. So
prediction in this area is about seventy per cent correct; that is,
regression equation (3) can be used to predict about seven out of ten
failures at Phillips University, according to this finding.
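The decile blocking and the "per cent correct" figures used in this analysis can be sketched as below. This is an illustrative reading of the procedure, assuming deciles are formed purely by rank order with ten equal blocks; the function names and the twenty hypothetical students are not from the study.

```python
def decile_ranks(marks):
    """Decile of each student by rank order (1 = lowest tenth, 10 = highest tenth)."""
    n = len(marks)
    order = sorted(range(n), key=lambda i: marks[i])   # student indexes, lowest mark first
    deciles = [0] * n
    for position, student in enumerate(order):
        deciles[student] = position * 10 // n + 1
    return deciles

def band_hit_rate(predicted_deciles, achieved_deciles, low, high):
    """Of the students predicted in deciles low..high, how many achieved a decile in that band."""
    achieved_in_band = [a for p, a in zip(predicted_deciles, achieved_deciles)
                        if low <= p <= high]
    hits = sum(1 for a in achieved_in_band if low <= a <= high)
    return hits, len(achieved_in_band), hits / len(achieved_in_band)

# Twenty hypothetical students: predictions are exact except that the
# student predicted highest actually achieves a middling mark.
predicted_marks = [float(i) for i in range(1, 21)]
achieved_marks = list(predicted_marks)
achieved_marks[19], achieved_marks[9] = achieved_marks[9], achieved_marks[19]

hits, total, rate = band_hit_rate(decile_ranks(predicted_marks),
                                  decile_ranks(achieved_marks), 7, 10)
```

In this toy case seven of the eight students predicted in deciles seven to ten are found there. The same ratio underlies the text's figures, for example sixteen of forty-two in deciles four to six, or about thirty-eight per cent.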

The superscript numbers may be used for a still more detailed analysis
of prediction in the different deciles of the distribution. For
example, in Table IV, decile ten, it may be observed that student 4 has
a decile displacement of 2; student 24 has a decile displacement of 4;
student 26, 3; student 27, 2; student 58, 1; and student 94, 3. This is
a total decile displacement of 15 for the fifteen students in decile
ten, or an average decile displacement of 1. But the fifth decile has a
total decile displacement of 31, or an average decile displacement of
more than 2.
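The decile-ten arithmetic can be checked with a short sketch. The six displacements are those stated in the text for students 4, 24, 26, 27, 58 and 94; the remaining nine students of decile ten are taken to have no displacement, as the text implies. The variable and function names are illustrative.

```python
def displacement_summary(displacements):
    """Total and average decile displacement for one predicted decile."""
    total = sum(displacements)
    return total, total / len(displacements)

# Displacements stated in the text for decile ten of Table IV
stated = {4: 2, 24: 4, 26: 3, 27: 2, 58: 1, 94: 3}
students_in_decile_ten = 15

# Students not named in the text stay in decile ten (displacement 0)
displacements = list(stated.values()) + [0] * (students_in_decile_ten - len(stated))
total_displacement, average_displacement = displacement_summary(displacements)
```

The total is 15 and the average 1 decile, agreeing with the text. For comparison, the total of 31 reported for the fifth decile gives an average above 2 for any plausible decile size of fourteen or fifteen students.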

A word might be said in explanation of why 
prediction is better in the upper and lower 
parts of the decile tables than in the central 
part. 

This situation has been understood and dis- 
cussed by statisticians, as applied to the nor- 
mal curve. It has been described as sigma 
difficulty. At the upper end of the curve 
sigma difficulty is so great that only a few 
reach these ranks, and these are scattered over 
relatively wide stretches of the base line of the 
curve. At the lower end of the curve sigma 
difficulty is so small that most students rise 
to higher positions, leaving only a scattered 
few in that portion of the curve, and they also 
appear at relatively wide intervals on the base 
line of the curve. Hence, when a distribu- 
tion curve is thrown into a decile ranking, it 
is necessary to stretch the upper and lower 
deciles over relatively wide areas of the curve 
in order to include the necessary tenths of 
students in them. 

It may be observed in Table IV that decile 
ten covers an area of .69 grade points (5.00 
to 4.31) in the distribution and that deciles 
seven to ten cover 1.62 grade points, or an 
average of .405 grade points to the decile, 
whereas deciles four, five and six cover only 
.58 grade points, or an average of only .19 
grade points per decile in this section. Decile 
prediction, therefore, should be about twice 
as accurate in the upper portion of the dis- 
tribution as in the central area, which was 
found to be so in the compared analyses of 
deciles ten and five above. This same situa-
tion prevails, though not to such a marked
degree, in the lower area of the distribution 
as compared with the central portion, and 
accounts for the increased proficiency of pre- 
diction there. 
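The width comparison in the paragraph above is simple division, sketched here for convenience. The spans of 1.62 and .58 grade points are the figures given in the text; the function and variable names are illustrative only.

```python
def average_width(span_in_grade_points, number_of_deciles):
    """Average stretch of the mark scale covered by one decile."""
    return span_in_grade_points / number_of_deciles

upper_width = average_width(1.62, 4)    # deciles seven to ten
middle_width = average_width(0.58, 3)   # deciles four, five and six
width_ratio = upper_width / middle_width
```

The upper deciles average .405 grade points each and the middle deciles about .19, a ratio a little over two to one, which is the basis for the statement that decile prediction should be about twice as accurate in the upper portion.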

Another observation that may be made on 
Table IV is that no student predicted to be 
in deciles one to four achieved the tenth 
decile, and only one achieved as high as the 
ninth. Also, no student in the ninth and 
tenth deciles fell lower than the fifth. A stu- 
dent predicted to be in the tenth decile had 
1 in 1.67 chances of achieving the tenth 
decile; whereas a student predicted to be in 
the lower half of the distribution had only 
one in seventy-two chances of achieving the 
tenth decile. So a student predicted to be in
the tenth decile is forty-three times as apt to 
achieve that rank as is a student predicted to 
be in deciles one to five inclusive. 
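The "forty-three times" comparison follows from the two chances quoted in the text, as this short sketch shows. Only the figures 1 in 1.67 and 1 in 72 come from the source; the variable names are illustrative.

```python
# Chance of achieving the tenth decile, as stated in the text
chance_if_predicted_tenth = 1 / 1.67        # student predicted to be in decile ten
chance_if_predicted_lower_half = 1 / 72     # student predicted to be in deciles one to five

relative_likelihood = chance_if_predicted_tenth / chance_if_predicted_lower_half
```

The ratio is 72 / 1.67, or about 43.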


Regression (3), Table IV, predicts first 
semester college freshman marks for boys
somewhat better than it does for girls. There 
are seventy-five boys and sixty-five girls in 
the distribution. The decile placement of 
eighteen boys, or twenty-four per cent, is 
exactly predicted, whereas the decile rank of 
twelve girls, or nineteen per cent, is exactly 
predicted. Forty-five per cent of the girls 
exceed the decile rank predicted for them, 
and forty-four per cent of the boys exceed it. 
Thirty-two per cent of the boys fall below the 
decile rank predicted and thirty-six per cent 
of the girls fail to reach it. Of the thirty- 
two boys predicted to be in deciles one to four 
inclusive (the area below the success mark at 
Phillips University), twenty-three or seventy- 
two per cent are found to have achieved that 
rank; and of twenty-four girls predicted to be 
in those four deciles, fifteen or sixty-three per 
cent are found to have achieved that rank. 
Of the twenty-nine boys predicted to be in 
deciles seven to ten (the area of S (A) and 
G (B) marks at Phillips University) twenty- 
one or seventy per cent were found to be 
there; and of the twenty-six girls predicted 
to be in those four deciles, twenty-one or 
eighty-one per cent were found to be there. 
It thus appears that regression equation (3) 
predicts the college freshman success of boys 
more accurately than that of girls in exact 
decile prediction throughout the distribution 
as well as in the degree of accuracy in pre- 
diction in deciles one to four. Prediction, 
however, is more accurate for girls in deciles 
seven to ten. The last statement is not true 
when limited to decile ten, in which prediction 
for boys is more accurate than for girls. This 
tendency towards more efficient prediction for 
boys at both ends of the distribution, based 
on regression equation 3, is also true, in gen- 
eral, of the other regressions used for predic- 
tion in this study. 


Regression equations 1, 2, 7, 8, and 9, in 
which first semester marks are also predicted, 
might be analyzed and presented as regression 
equation 3 has been in Table IV, but space 
will not be taken for such detailed treatment 
here. Similarities to, and differences from, 
the findings in Table IV as treated above, may 
be noted in Tables VI and VII, which are a summary of findings in
regression equations 1 to 10 inclusive, treated as in Table IV.

Students' predicted rankings by the use of regression equations 7, 8,
and 9, which are [based upon single measures, are more] varied than
they are when derived from regression equations 1, 2, and 3, which are
based upon multiple correlations. For example, the rankings of student
119 by regression equations 7, 8, and 9 are deciles five, nine and ten,
a variation of five deciles, whereas the rankings of student 119 in
regression equations 1, 2, and 3, which are based on multiple
correlations, are nine, nine and ten, a variation of only one decile.
For student 22 the corresponding rankings are four, five and eight, a
variation of four deciles, as compared with five, six, and six, a
variation of one decile. For student 30 the corresponding rankings are
three, five and nine, a variation of six deciles, as compared with
seven, seven and seven, a variation of no deciles. These are random
choices and illustrate the point under discussion. This finding
illustrates how inadequate may be the practice of sectioning or ranking
entering college freshmen on the results of a single measure.

Table V is a presentation of predicted and achieved second semester
marks in decile distribution. This table is chosen for analysis because
it is based upon a five-variable correlation, and is the most efficient
of the correlations used for the prediction of second semester marks. A
brief study of Table V will be sufficient to demonstrate that
regression equation 6, on which the table is based, is a more efficient
agent for the prediction of second semester marks than any of those
used to predict first semester marks. For example, exact
decile prediction is twenty-nine per cent in 
Table V, as compared with twenty-four per 
cent for regression equation (3) on which 
Table IV is based, and this is the most effi- 
cient combination for predicting first semes- 
ter marks, as was seen above. The balance 
is also better between the per cent of students 
exceeding or falling below the predicted mark 
being thirty-seven and thirty-two per cent 
respectively for Table V, as compared with 
forty-five and thirty-four for Table IV. The 
superiority of Table V, however, is slight in 
predicting the marks of abler students, the 
per cent of correct prediction in deciles seven 
to ten being seventy-eight for Table V as 
compared with seventy-six for Table IV. The 
superiority of Table V is apparent in the cen- 
TABLE V

DECILE RANKS OF PREDICTED SECOND SEMESTER MARKS OF 126 PHILLIPS UNIVERSITY FRESHMEN,
1934-1935, BASED UPON THE CORRELATION R (SECOND SEMESTER MARKS) (FIRST SEMESTER
MARKS, HIGH SCHOOL AVERAGE, OHIO STATE UNIVERSITY PSYCHOLOGICAL TEST SCORES, AND
ENGLISH PLACEMENT TEST SCORES) = .836, COMPARED WITH CORRESPONDING DECILE RANKS
ACHIEVED

Predicted Ranks                          Achieved Ranks
126, 45, 114, 137, 20, 7 136, 114, 1945, 7, 5, 20 
; 116, 19, 88, 139, 68, 87," 119 68, 139, 32, 90, 88, 119 

4, 80, 52, 129, 6, 58 137, 85, 116, 129, 1, 13, 70 


i iM 
wah 90, 32, 27, 75, 70, 9 6, 9, 27, 33, 54, 55, 58, 87 


85, 13, 24, 99, 23, 105 52, 62, 99, 110, 124, 26, 127 





epee 30, 33, 55, 11, 117, 135 131, 105, 15, 51, 89 | 


26, 1, 59, 127, 25, 2, 94 67, 83, 128, 4, 91, 120, 23 
109, 51, 86, 54, 15, 44 30, 75, 18, 109, 104 
‘4 128, 106, 126, 65, 101, 53 59, 2, 46, 25, 40, 53 
a 31, 43, 131, 18, 115, 120 24, 84, 44, 63, 31 
9 110, 14, 83, 124, 92, 113 80, 134, 101, 108, 3, 113, 106 
96, 64, 62, 104, 22, 63 135, 117, 78, 69, 115, 11 
36, 100, 102, 134, 66, 89 14, 121, 122, 107, 35, 132 
‘ | 
91, 28, 61, 78, 84, 38 86, 43, 37, 66, 22, 138, 16 
138, 140, 21, 107, 35, 121 96, 64, 65, 125, 126, 28 . 
¢ . 2 ‘ ‘ ‘ | 
64 , sin 40, 122, 5, 17, 97, 37, 76, 69 73, 81, 17, 76, 61, 38 | 
Zs: = , 3S | 
2.4 41, 130, 39, 46, 3, 50, 81 21, 140, 100, 71, 42, 36, 102 
am cum | 
2.21 ane ilies 16, 73, 111, 118, 132 92, 123, 50, 111, 97 
:. 2.20 125, 71, 56, 42, 133, 12, 82 39, 41, 8, 95, 10, 49, 56 
f 1 ‘ | 
DD icteutinenee 49, 95, 10, 8, 123 133, 12, 83, 130, 94, 118 
Superscript shows rank of student in other distribution.

tral areas, and marked in deciles one to four, as may be observed on
Table VII. In Table V all students predicted to be in decile ten are
found to be there except three, and all three of these are in decile
nine. This high degree of exact decile prediction is also to be
observed in decile one, where all students predicted to be there are
found there except four, and none of them rise above the third decile.
A student predicted to be in the tenth decile has 1 in 1.3 chances of
being there; whereas a student predicted to be in deciles one to eight
inclusive has no chance of being in decile ten. No student predicted to
be in the lower half of the distribution rises above the eighth decile.

It may be observed on Table VII that regression equation (3) with five
variables predicts first semester marks in the extremities of the
distribution (deciles 7-10 and 1-4) with nearly as much efficiency as
first semester marks predict second semester marks in those deciles.
Since these are the areas of superior scholarship and failure,
regression


TABLE VI

NUMBER AND PER CENT OF PHILLIPS UNIVERSITY FRESHMEN, 1934-1935, WHOSE ACHIEVEMENT
MARKS EQUALLED, EXCEEDED, OR FELL BELOW THEIR PREDICTED MARKS, BY DECILES

Equalled             Exceeded             Fell Below
Boys  Girls  All     Boys  Girls  All     Boys  Girls  All

[Body of table illegible.]

The upper row in each double row of figures indicates number of students and the
lower row shows per cent.

TABLE VII

NUMBER AND PER CENT OF PHILLIPS UNIVERSITY FRESHMEN, 1934-1935, WHOSE ACHIEVEMENT
MARKS EQUALLED THEIR PREDICTED MARKS IN SPECIFIED
AREAS OF THE DECILE DISTRIBUTION

Deciles 7-10         Deciles 4-6          Deciles 1-4
B     G     All      B     G     All      B     G     All

[Body of table illegible.]

The upper row in each double row of figures indicates number of students and the lower
row shows per cent.
equation (3) could probably be used almost as effectively for guidance
of entering college freshmen as could first semester marks for guidance
of students into work for the second semester.

On Tables VI and VII it may be observed that high school average marks
are a little superior to mental tests in predictive efficiency, in
spite of the fact that Ohio State University Psychological Examination
scores correlate somewhat higher with first semester marks. American
Council Test scores are decidedly inferior, in predictive efficiency,
to the other two, especially when limited to specified areas of the
distribution. Exact prediction of decile placement is twenty-two per
cent for high school average, twenty for the Ohio test, and nineteen
for the American Council Test. The balance between the percentage of
students exceeding or falling below predicted marks when applied to the
student body as a whole is somewhat better for high school average, the
percentages being thirty-nine and forty respectively for high school
average, forty-three and thirty-seven for the Ohio State University
Psychological Examination, and forty-three and thirty-nine for the
American Council Test. However, when analyzed separately for boys and
girls, a different picture is presented. High school average is the
poorest of the three measures in predicting exact decile placement for
girls, the prediction for high school average being fifteen per cent,
compared with twenty for American Council Test and seventeen for Ohio
State University Psychological Examination. The balance between the per
cent of students exceeding and those falling below predicted marks is
quite different for the sexes taken separately from what it is for the
student body as a whole. The boys, in much larger proportions than
girls, fail to measure up to the marks predicted by the mental tests,
forty-seven per cent of them falling below predicted marks in both
mental tests, as compared with thirty-one and thirty-two per cent
exceeding predicted marks. The reverse is true of girls, fifty-five and
fifty-seven per cent exceeding marks predicted by mental tests, and
twenty-four and twenty-six per cent failing to reach them. In
predictions based upon high school average marks the position of the
sexes is reversed from that in the mental tests, forty-one per cent of
boys exceeding predicted marks and thirty-one per cent falling below,
whereas only thirty-five per cent of girls exceed predicted marks and
fifty per cent fall below.
Table VI shows that high school average 
marks are also a little superior to the mental 
tests in predicting the marks of the abler stu- 
dents, when attention is centered on the stu- 
dent body as a whole, prediction by high 
school average being seventy-two per cent cor- 
rect compared with sixty-six per cent correct 
for the American Council Test and sixty-nine 
for the Ohio State University Psychological 
Examination. Superiority for high school av- 
erage marks in the middle deciles is decided, 
the prediction being forty-three per cent cor- 
rect for high school average, as compared with 
twenty-six for American Council Test and 
thirty-four for the Ohio State University Psy- 
chological Examination. In deciles one to 
four high school average marks are superior 
to American Council Test scores, but slightly 
inferior to Ohio State University Psychologi- 
cal Examination scores, the prediction being 
sixty-seven per cent for high school average, 
fifty-seven for American Council Test, and 
sixty-eight for the Ohio State University Psy- 
chological Examination. However, when the 
sexes are considered separately a quite differ- 
ent picture is presented again. Both mental 
tests predict the marks of the abler girls 
(deciles 7—10) better than does the high 
school average, the predictions being seventy- 
one and seventy-six per cent correct respec- 
tively for the American Council Test and the 
Ohio State University Psychological Exam- 
ination, as compared with sixty-four per cent 
for high school average. But the reverse is 
true for abler boys, the prediction being 
eighty-one per cent correct for high school 
average, and sixty-two and sixty-six respec- 
tively for American Council Test and Ohio 
State University Psychological Examination. 
In deciles one to four the mental tests pre- 
dicted the marks of boys better than those of 
girls, the prediction being sixty-four and 
eighty-one per cent respectively for the Amer- 
ican Council Test and the Ohio State Univer- 
sity Psychological Examination, and forty- 
eight and fifty-seven for girls. About the 
only statement one can make with confidence 
about the comparative predictive efficiency of 
high school average marks, American Council 
Test scores, and the Ohio State University 
Psychological Examination scores is that none 
is best for all levels of ability and for both
sexes.
Some general statements may be made from observations on Tables VI and
VII. First, with two exceptions, the exact prediction of decile
placement is better for boys than for girls, the average percentage of
exact decile prediction for boys being 27.2 and for girls 20.3. Second,
girls, with but one exception, in larger proportions than do boys, tend
to exceed their predicted marks, the average per cent of girls
exceeding predictions being 40.9, and that of boys 33.4; and
conversely, boys in larger proportions than girls tend to fall below
their predicted marks, the average per cent who fall below being 38.1
for boys and 3[?].5 for girls. Third (on Table VII), on the whole, the
measures used predict the marks of abler girls somewhat better than
they do those of abler boys (deciles 7-10), the average per cent of
correct prediction for girls being 76.8 and that for boys 73.3. Fourth
(on Table VII), the measures used predict the marks of the average girl
better, with one exception, than they do those of the average boy
(deciles 4-6), the average per cent of correct prediction being 43.8
for girls and 36.5 for boys. Fifth (on Table VII), in the lower levels
of the distribution, especially in the grouping of deciles 1-4,
prediction for boys is better, with one exception, than it is for
girls, the average per cent of correct prediction being 76.4 for boys
and 62.8 for girls. Sixth (on Table VII), with two slight exceptions,
the measures used predict marks of students in deciles seven to ten
with a degree of accuracy above seventy per cent, when applied to the
student body as a whole, and this prediction is not far from constant,
the average prediction in this area being 74.7 per cent correct. This
is noteworthy, since the ten correlations used as the basis of these
predictions ranged from .408 to .836. Seventh (on Table VII), no
correlation used predicted as many as fifty per cent correct in deciles
four to six inclusive, when applied to the student body as a whole, the
average prediction in this area being 39.9 per cent correct. Eighth (on
Table VII), for the ten regressions used, the average prediction in
deciles one to four is 70.6 per cent correct when applied to the
student body as a whole. This is not far from the efficiency of
prediction (74.7 per cent correct) noted in deciles seven to ten. And
so the statement may be repeated that prediction is far more efficient
in the upper and lower areas of the distribution, and so low as to be
of little apparent use in the central areas.
The Critical Point in the Prediction of College Marks

Phillips University requires an average mark of M (C), designated (3)
for the purposes of this investigation, as indicated above, for
graduation. This may be called the success mark, and may be used as a
criterion of success in the school. The question then arises, what mark
should a student obtain the first semester of the freshman year in
order to have as many as 50 chances in 100 of achieving this necessary
average for graduation at the close of his senior year? When this mark
is known in a school, it may be termed the critical point in the
marking system of the institution.

In an effort to determine the critical point at Phillips University two
investigations were made. First, the average was found of the marks
given in freshman courses for the first semester, 1934-1935; this was
found to be 3.46. The average mark for all advanced courses for the
same semester was found to be 3.42. The average mark in freshman
courses for the second semester was 3.43, while that in advanced
courses was 3.48. It thus appears, in general, that no higher marks are
given in advanced courses at Phillips University than are given in
freshman courses; and that if a student hopes to achieve the required
average of 3.00 for graduation he must obtain about that average in the
first semester of his freshman year.


Second, the complete records of the marks 
of bachelor of arts graduates for the years 
1932, 1933, and 1934 were studied. The 
average mark for the first semester was compared
with the average mark for all other
courses taken before graduation, for each stu- 
dent. Table VIII shows the results of this 
investigation. 

Table VIII seems to warrant the statement 
that marks received by bachelor of arts graduates
at Phillips University for the first semester
of the freshman year are approximately
equal to those obtained in later courses pur- 
sued. So we may conclude again that if a 
student hopes to obtain the average mark of 
3.00 required for graduation at Phillips Uni- 
versity, he must achieve approximately that 
mark for the first semester of his freshman 
year. Hence the critical point at Phillips
University may be said to coincide approximately
with the success mark.

TABLE VIII

MARKS OF BACHELOR OF ARTS GRADUATES FOR THE YEARS 1932, 1933, AND 1934,
COMPARING AVERAGE MARKS FOR THE FIRST SEMESTER OF THE FRESHMAN YEAR
WITH THE AVERAGE OF LATER MARKS OBTAINED

Number of      First Semester    Average of Later
Students       Average           Marks Obtained

 68  3.68
 57  3.69
 47  3.58

172            3.65              3.66


This conclusion is confirmed, at least in part, by the fact that of the
forty-two students of this study who failed to reach the required
average of 3.00 for the first semester, and who continued their work
for the second semester, only nine achieved the mark. The marks of
twenty-two of them were lower for the second semester than for the
first, while those of twenty were higher. Only one student whose
average mark was below 2.46 for the first semester reached the success
mark for the second semester. Of the sixteen students of the study who
withdrew from school at the end of the first semester, thirteen had
failed to reach the success mark.

On the assumption, then, that 3.00 is the critical point, as well as
the success level at Phillips University, what are the chances that a
student with a given mark, predicted by any of the regression equations
used in this study, will achieve the success mark, and so will probably
graduate from the school? The answer to this question, so far as the
data and techniques used in this study can answer it, is presented in
connection with regression equation 2 of this study, in which the
multiple correlation of first semester average freshman marks with Ohio
State University Psychological Examination scores and high school
average marks is .605, with a probable error of estimate of ± .436 of a
grade point. The regression equation is X = .014X1 + .500X3 + .249. Any
mark obtained by the use of the equation must be evaluated in terms of
the probable error of estimate. Take, for example, student 136, whose
predicted mark is 4.49. What are his chances of achieving the success
mark of 3.00? His predicted mark is 1.49 grade points above the
required mark. Dividing this 1.49 by the probable error of estimate,
.436, the quotient is 3.4. This means that the predicted mark of 4.49
is 3.4 probable errors above the critical point of 3.00. This, based
upon the normal curve, gives student 136 about ninety-nine chances
(98.9) in one hundred of achieving the success mark. Student 46, who
has a predicted mark of 3.00, has the same chance to succeed as he has
to fail, that is, fifty chances in one hundred, either way.

In order to illustrate more fully the use of the critical point in the
prediction of college marks, Table IX, based upon the normal curve and
derived from the regression equation given above, was prepared. For
convenience in computation, probable error ratings are given in tenths.
Beginning at the critical point, 3.00, which has a zero probable error
rating, one-tenth of a probable error, .0436 of a grade point, is added
to, or subtracted from, the predicted score as the probable error
varies in tenths up or down the table. This technique is adopted from
Segel (11, p. 55). Now the chances that any student with a given mark
predicted by regression equation 2 will achieve the success mark of
3.00 may be read directly from the table. Take student one, for
example, with a predicted mark of 3.43. From Table IX, his chances for
success are observed to be seventy-five in one hundred, and his chances
for failure twenty-five in one hundred. Student three, with a predicted
mark of 2.28, has fourteen chances in one hundred of achieving the
success mark, and eighty-six chances in one hundred of failing to do
so.

TABLE IX

CHANCES OF FAILURE AND SUCCESS FOR MARKS PREDICTED BY REGRESSION
EQUATION X = .014X1 + .500X3 + .249, USING 3.00 AS THE CRITICAL POINT

TABLE X

STUDENTS WHOSE MARKS REGRESSION EQUATION TWO (X = .014X1 + .500X3 + .249)
PREDICTS TO BE BELOW THE CRITICAL MARK, THEIR CHANCES OF ACHIEVING
SUCCESS, AND THEIR ACHIEVED MARKS

           Predicted  Achieved  % Suc-              Predicted  Achieved
Student    Mark       Mark      cessful   Student   Mark       Mark
50         2.99       2.47      49        ..        2.68       2.57
..         2.99       2.7.      49        ..        2.68       1.83
..         2.98       2.13      48        ..        2.68       3.06
104        2.9.       3.14      47        ..        2.67       2.87
37         2.95       3.00      47        ..        2.67       4.07
18         2.95       2.00      47        ..        2.66       2.70
..         ..         ..        46        ..        2.64       3.43
..         ..         ..        45        ..        2.60       3.44
..         ..         ..        44        ..        2.61       3.19
..         ..         ..        43        ..        2.58       2.67
..         ..         ..        43        ..        2.57       3.38
..         ..         ..        42        ..        2.56       2.19
..         ..         ..        42        ..        ..         2.85
..         ..         ..        42        ..        2.5.       2.18
..         ..         ..        41        ..        2.5.       ..
..         ..         ..        ..        ..        2.5.       3.1.
..         ..         ..        39
..         ..         ..        38
..         ..         ..50      37
..         ..         3.00      37
..         ..         3.77      35
..         ..         2.92      35
..         ..         2.92      34
..         ..         3.44      34
..         ..         3.00      32

Tables X and XI present a complete list of the students of this study
with their marks predicted by regression equation 2
(X = .014X1 + .500X3 + .249) and interpreted with reference to the
critical point by the use of Table IX. It may be observed from Table X
that the chances of reaching the success mark, for students predicted
to fall below the critical point, range from forty-nine in one hundred
down to seven in one hundred.
Although, theoretically, all students in the distribution have some
chance of achieving the success mark, it may be seen from the column of
achieved marks that no student with a predicted mark below 2.52
succeeded in doing so. Prediction with reference to the critical point
in the lower part of the distribution, when applied to both sexes
together, is [...] per cent correct, thirty-seven of the [...] students
predicted to fall below the critical point being found to do so. When
applied to boys alone, prediction in this area is [...] per cent
correct, twenty-two of the [...] predicted to fall there being found to
do so. When applied to girls alone, prediction is only fifty-[...] per
cent correct, fourteen out of twenty-[...] being found to do so.

TABLE XI

STUDENTS WHOSE MARKS REGRESSION EQUATION TWO (X = .014X1 + .500X3 + .249)
PREDICTS TO BE ABOVE THE CRITICAL POINT, THEIR CHANCES OF FAILING,
AND THEIR ACHIEVED MARKS

           Predicted  Achieved  %
Student    Mark       Mark      Failing

*Underscored numbers refer to girls.

It may be seen from Table XI that prediction above the critical point,
when applied to both sexes together, is much more accurate than below
the critical point. Eighty-five students were predicted to exceed the
critical point, and sixty-nine, or eighty-one per cent of them, were
found to do so. Prediction here is also more nearly equal for the sexes
considered separately, being seventy-[...] per cent correct for boys
and eighty-two per cent correct for girls. Chances in one hundred of
failure to be above the critical point, in this area, are from one up
to [...]. Theoretically, every student of the distribution has some
chance of falling below the critical point, but it may be observed that
no one with a predicted mark above 3.52 falls below that point; just as
no student predicted below 2.52 rose above it. Prediction with
reference to the critical point is, in this distribution, perfect above
mark 3.52 and below mark 2.52, that is, above probable error rating 1.2
and below probable error rating 2.5. That is to say, the error in
prediction by the use of this technique is almost entirely confined to
the area of the first probable error around the critical point.

When marks predicted by another regression equation are to be evaluated
by this technique, another table based upon the probable error of
estimate of that equation must be constructed, or the chances for
success of each student's mark must be figured separately, as was done
in the case of student 136, above. All parts of Table IX, however,
except the first column, would remain constant for all the regression
equations used in the study. Predicted marks in the first column,
corresponding to the probable error ratings, would vary according to
the size of the probable error of estimate of the regression equation
for which the table was being constructed. In a table prepared for use
with regression equation [...], for example, whose probable error of
estimate is .294 of a grade point, the predicted mark corresponding to
a probable error rating of 1.0 on the table would become 3.29, instead
of 3.44, as appears in Table IX; whereas a score of 3.44 on such a
table would have a probable error rating 2.6 above the critical point
of 3.00, and the student would have ninety-six chances in one hundred
of achieving the success mark, instead of seventy-five chances in one
hundred, as in Table IX.

Differential Prediction of College Marks


The question of the student's relative success in the different
subjects is a problem second only to that of his general success in the
institution; and his general success is closely tied up with, and often
dependent upon, success in the different subjects. This is especially
so in schools where many prerequisites for graduation exist in the
different departments and curricula. Tests and measures thus far
devised correlate higher with general success than with marks in the
separate subject fields. The labor involved in the statistical
processes necessary for prediction in the separate fields has also been
a deterrent to investigations here. However, a technique has recently
been developed by which the prediction of differences in a student's
achievement in the different subject fields is greatly facilitated.
David Segel is the author of this procedure (11, pp. 76-89), a brief
description of which follows:


Segel's formula for finding the correlation between a predicting agent
and the difference of marks received in two subject fields is

\[ r_{(a-b)x} = \frac{r_{ax}\sigma_a - r_{bx}\sigma_b}{\sqrt{\sigma_a^2 + \sigma_b^2 - 2 r_{ab}\sigma_a\sigma_b}}, \]

where (a) is one subject or field and (b) another, and (x) is the
predicting agent. This is simply an expansion of the ordinary
correlation formula, with (a − b) taking the place of one variable, and
can easily be shown to produce the same results as if the individual
differences in marks in the two subjects were found and these
correlated with the x-scores. Segel demonstrated this. For example,
where (a) is mathematics and (b) is science, and (x) is the Ohio State
University Psychological Examination; and sigma (a) is 1.11, sigma (b)
1.07 and sigma (x) 24.89, and the correlation of (a) and (x) is .561,
of (b) and (x) .364, and of (a) and (b) .714, then

\[ r_{(a-b)x} = \frac{.561 \times 1.11 - .364 \times 1.07}{\sqrt{1.11^2 + 1.07^2 - 2(.714 \times 1.11 \times 1.07)}} = .283. \]

This means that students with high Ohio State University Psychological
Examination scores will make higher marks in mathematics than in
science, while those with low scores in the Ohio State University Test
may do as well in science as in mathematics, or possibly better. The
regression equation for the prediction of (a − b) from (X) is

\[ X_{(a-b)} = r_{(a-b)x}\,\frac{\sigma_{(a-b)}}{\sigma_x}\,(X - M_x) + M_{(a-b)}, \]

where X_(a−b) is the difference between the marks in two subject
fields, M_x is the mean of the predicting agent and M_(a−b) the
difference in the means of marks in the subject fields. Using the data
in the example above, in which sigma (a − b) is .825, and M_x is 72.84,
M_a is 3.54, M_b is [...], and sigma (x) is 24.89,

\[ X_{(a-b)} = .283\,\frac{.825}{24.89}\,(X - 72.84) + (3.54 - M_b) = .0094X - .315. \]

The formula for the probable error of estimate is

\[ PE_{est} = .6745\sqrt{\left[r_{(a-b)x}\,\sigma_{(a-b)}\right]^2 (1 - r_{xx})}, \]

in which r_xx is the reliability of the X-variable or predicting agent.
In our example, PE_est = .6745 √{[(.283)(.825)]² (1 − .92)}.
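
The worked example can be checked numerically. The sketch below is only an illustration under stated assumptions: the function names are invented here, the value 24.89 for sigma (x) is a reading of a damaged passage, and the six-student lists are made-up toy data used solely to show that the formula agrees with correlating the individual differences directly.

```python
import math


def pearson(u, v):
    """Plain Pearson correlation of two equal-length lists."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((p - mean_u) * (q - mean_v) for p, q in zip(u, v))
    ss_u = sum((p - mean_u) ** 2 for p in u)
    ss_v = sum((q - mean_v) ** 2 for q in v)
    return cov / math.sqrt(ss_u * ss_v)


def sd(u):
    """Population standard deviation (divisor n)."""
    mean_u = sum(u) / len(u)
    return math.sqrt(sum((p - mean_u) ** 2 for p in u) / len(u))


def difference_correlation(r_ax, r_bx, r_ab, sd_a, sd_b):
    """Return (r of (a - b) with x, sigma of (a - b)) from summary statistics."""
    sd_diff = math.sqrt(sd_a ** 2 + sd_b ** 2 - 2.0 * r_ab * sd_a * sd_b)
    return (r_ax * sd_a - r_bx * sd_b) / sd_diff, sd_diff


# Mathematics (a), science (b), Ohio State examination (x), figures from the text.
r_example, sd_diff_example = difference_correlation(0.561, 0.364, 0.714, 1.11, 1.07)

SD_X_ASSUMED = 24.89  # assumption: sigma (x) as read from a damaged passage
slope_example = r_example * sd_diff_example / SD_X_ASSUMED

# Toy data (hypothetical marks and scores), only to check the equivalence claim.
a = [4, 3, 5, 2, 4, 3]
b = [3, 3, 4, 3, 2, 4]
x = [80, 60, 95, 50, 85, 55]
r_formula, _ = difference_correlation(
    pearson(a, x), pearson(b, x), pearson(a, b), sd(a), sd(b)
)
r_direct = pearson([p - q for p, q in zip(a, b)], x)

print(round(r_example, 3), round(sd_diff_example, 3), round(slope_example, 4))
```

The two routes agree because the covariance of (a − b) with x is the covariance of a with x less that of b with x, so nothing is lost by working from the summary correlations alone.
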

The technique described above was used in this investigation in an
effort to determine how well Ohio State University Psychological
Examination scores, American Council Test scores and high school
average marks can predict differences in marks to be achieved in the
first semester of the college freshman year, in the different subject
fields. All the data used for this phase of the study are found in
Table I. From these data the correlations shown in Table XII were
calculated.

A brief study of Table XII reveals the fact that correlations of high
school average marks with differences in subject field marks are
higher, in general, than are correlations of the mental tests with
differences in subject field marks. This would seem to indicate that
high school average is superior to mental tests in the differentiation
of student achievement in the subject fields. In practice, however,
this does not prove to be so, chiefly because of the unreliability of
high school marks, as compared with that of the mental tests. The
reliability of high school marks, as used in this study, is estimated
at .60, while that of both mental tests is .93 or above. This causes
the probable error of estimate to be much higher, in most cases, for
equations in which high school marks is a factor than for those
involving the mental tests. This situation stresses the supreme
importance of reliability in predictive studies. A relatively high
correlation coefficient counts for but little, if the probable error of
estimate is also high.
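
The effect of reliability on the probable error of estimate can be illustrated with a short calculation. This is a sketch only: it assumes the probable error of estimate takes the form .6745 times the square root of [r sigma(a − b)]² (1 − r_xx), and the function name is invented here. The first call uses the figures given for mathematics minus science predicted from high school marks (r = .391, sigma (a − b) = .825, reliability .60); the second is a purely hypothetical comparison showing what the same correlation would yield if the predictor were as reliable as the mental tests (.93).

```python
import math


def pe_of_estimate(r_diff_x, sd_diff, reliability_x):
    """Probable error of estimate for a predicted difference (assumed form)."""
    return 0.6745 * math.sqrt((r_diff_x * sd_diff) ** 2 * (1.0 - reliability_x))


# High school marks as predictor, reliability estimated at .60 in the study.
pe_hs = pe_of_estimate(0.391, 0.825, 0.60)

# Hypothetical: the same correlation with a predictor of reliability .93.
pe_hs_if_reliable = pe_of_estimate(0.391, 0.825, 0.93)

print(round(pe_hs, 3), round(pe_hs_if_reliable, 3))
```

Under these assumptions the first figure agrees with the probable error of .138 reported for that regression situation, and the second is well under half as large, which is the point the paragraph above makes about reliability.
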

It may also be observed from Table XII 
that correlations of subject differences with 
the two mental tests are, in general, about the 
same, except where foreign language is a mem- 
ber of the prediction pair. In that case the 
results are quite different. The Ohio State 
University Psychological Examination exalts 
foreign language to a place slightly above 
mathematics and far above English, science 
and social science, whereas the American 
Council Test rates mathematics far above for- 
eign language, and places foreign language on 
about equal terms with English, science, and
social science. It may also be noted that the
probable errors of estimate are, in general,
about the same for the two mental tests.

From data given in Tables I and XII, fourteen
multiple correlations were run, in an effort
to find the most efficient combination of
factors for the predictive differentiation of
student achievement in the subject fields.
The six highest of these correlations, together
with their probable errors of estimate and
regression equations, are shown in Table XIII.

Only two of the correlations in Table XIII 
are above .50, and each of these contains high
school marks as a factor in the prediction
team. It may be observed again that the
probable error of estimate is larger in the
equations involving high school marks (X3)
as a factor in the predicting team. This fact 
again offsets the seeming advantage of higher 
correlations in the first two prediction pairs 
in the table, making the predictive value of 
those equations very small. 

TABLE XII

CORRELATIONS, PROBABLE ERRORS, AND PROBABLE ERRORS OF ESTIMATE OF OHIO
STATE UNIVERSITY PSYCHOLOGICAL EXAMINATION SCORES, AMERICAN COUNCIL TEST
SCORES, AND HIGH SCHOOL AVERAGE MARKS WITH DIFFERENCES IN COLLEGE MARKS
IN THE SUBJECT FIELDS, FIRST SEMESTER, 1934-1935

The first subject of each pair is a, the second b, as later used in the
expression (a − b).
The lower number in each case is the probable error of estimate.

TABLE XIII

MULTIPLE CORRELATION COEFFICIENTS, WITH PROBABLE ERRORS OF ESTIMATE,
AND REGRESSION EQUATIONS FOR PREDICTION PAIRS IN CERTAIN SUBJECT FIELDS

Prediction Pair      R      PE of Estimate      Regression Equation

From Tables XII and XIII, the following regression situations were
chosen for prediction in this phase of the study. Only those equations
with r or R above .150 were used. The writer is aware of the fact that
only two of these correlations are above the minimum usually set for
predictive purposes. But, since the chief purpose of this study is to
determine the predictive value of the measures used, and not to pass
judgment upon the adequacy of such value, a relatively large number of
correlations and regression equations have been included for
illustrative and comparative purposes:

1. The correlation of mathematics minus social science with Ohio State
University Psychological Examination scores is .171, with a probable
error of estimate of ± .031. The regression equation is
Xd = .0084X1 + .158.

2. The correlation of mathematics minus social science with American
Council Test scores is .181, with a probable error of estimate of
[...]. The regression equation is [...].

3. The correlation of mathematics minus social science with high school
average marks is .377, with a probable error of estimate of ± .164. The
regression equation is Xd = .651X3 − 1.74.

4. The correlation of English minus foreign language with Ohio State
University Psychological Examination scores is −.171, with a probable
error of estimate of ± .026. The regression equation is
Xd = −.0068X1 + .575.

5. The correlation of English minus foreign language with American
Council Test scores is .085, with a probable error of estimate of
[...]. The regression equation is Xd = [...]014X2 − .115.

6. The correlation of English minus foreign language with high school
average marks is .409, with a probable error of estimate of ± .146. The
regression equation is Xd = [...] 2.14.

7. The correlation of mathematics minus English with Ohio State
University Psychological Examination scores is .209, with a probable
error of estimate of ± .025. The regression equation is
Xd = .0067X1 − .208.

8. The correlation of mathematics minus English with American Council
Test scores is [...], with a probable error of estimate of ± .025. The
regression equation is Xd = [...]3X2 − .214.

9. The correlation of mathematics minus English with high school
average marks is [...], with a probable error of estimate of ± .067.
The regression equation is [...].

10. The correlation of mathematics minus science with Ohio State
University Psychological Examination scores is .283, with a probable
error of estimate of ± .055. The regression equation is
Xd = .0094X1 − .315.

11. The correlation of mathematics minus science with American Council
Test scores is [...], with a probable error of estimate of ± .022. The
regression equation is Xd = .0028X2 − .091.

12. The correlation of mathematics minus science with high school
average marks is .391, with a probable error of estimate of ± .138. The
regression equation is Xd = .542X3 − 1.717.

13. The correlation of foreign language minus social science with Ohio
State University Psychological Examination scores is .180, with a
probable error of estimate of ± .032. The regression equation is
Xd = .0085X1 − .2009.

14. The correlation of foreign language minus science with Ohio State
University Psychological Examination scores is .250, with a probable
error of estimate of ± .041. The regression equation is
Xd = .0114X1 − .820.

15. The correlation of mathematics minus foreign language with American
Council Test scores is .196, with a probable error of estimate of
± .002. The regression equation is Xd = .0045X2 − .37.

16. The correlation of mathematics minus foreign language with high
school average marks is .385, with a probable error of estimate of
± .190. The regression equation is Xd = .751X3 − 2.540.

17. The correlation of mathematics minus science with Ohio State
University Psychological Examination scores and American Council Test
scores is .468, with a probable error of estimate of ± .090. The
regression equation is Xd = .0164X1 − .0021X2 − .48.

18. The correlation of mathematics minus English with Ohio State
University Psychological Examination scores and American Council Test
scores is .261, with a probable error of estimate of ± .028. The
regression equation is Xd = .0024X1 + .0024X2 − .29.

19. The correlation of mathematics minus social science with Ohio State
University Psychological Examination scores, American Council Test
scores and high school average marks is .521, with a probable error of
estimate of ± .249. The regression equation is
Xd = .0023X1 + .0021X2 + .613X3 − 1.60.

20. The correlation of English minus foreign language with Ohio State
University Psychological Examination scores, American Council Test
scores, and high school average marks is .654, with a probable error of
estimate of ± .256. The regression equation is
Xd = .0329X1 + .0069X2 + .884X3 − 2.08.

With each of these equations the differences 
in marks between the pair of subjects indi- 
cated were predicted for each student and the 
mean of these predicted differences found. 
This mean was used as the basis for the de- 
termination of the reliability or efficiency of 
the predictions. If the predicted difference 
of a student in a given pair of subjects varies 
as much as four probable errors of estimate 
from the mean his score is considered en- 
tirely reliable. If his predicted difference is 
two probable errors away from the mean his 
score is eighty-two per cent reliable, that is, 
the chances are eighty-two in a hundred that 
the difference between those two subjects will 
be in the same direction for other students 
who have the same marks in the predicting 
variable or variables, as this student has. 
This technique was adopted from Segel (11, 
p. 85). 
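
The reliability figures quoted for two and four probable errors can be reproduced from the normal curve. The sketch below assumes, and this is an interpretation rather than something the text spells out, that "reliability" here means the normal-curve area lying within the stated number of probable errors on either side of the mean; the function name is illustrative.

```python
import math


def area_within(k_probable_errors):
    """Normal-curve area within +/- k probable errors of the mean (assumed reading)."""
    z = k_probable_errors * 0.6745  # one probable error = .6745 sigma
    return math.erf(z / math.sqrt(2.0))


two_pe = area_within(2)   # the case described as eighty-two per cent reliable
four_pe = area_within(4)  # the case described as entirely reliable

print(round(100 * two_pe), round(100 * four_pe, 1))
```

On this reading a deviation of four probable errors leaves well under one chance in a hundred outside the band, which is presumably why such predictions are treated as entirely reliable.
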

The results of these predictions are shown in Table XIV. It may be
observed that wherever High School Average marks (X3) is the predicting
agent, or a member of the predicting team, the predictive efficiency is
low, compared with that of the mental tests. It may also be noted that
there is little to choose between the predictive efficiency of the Ohio
State University Psychological Examination and the American Council
Test in this phase of the study.

It is somewhat difficult to generalize on predictive efficiency, as
between pairs of subjects listed in Table XIV. Wherever mathematics or
social science is a member of the prediction pair, the discrimination
in achievement is, in general, great. This is especially true of
mathematics. A somewhat surprising discrimination appears between
English and foreign language, the efficiency of prediction being as
great between these two subject fields as between Mathematics and
English. Taking the whole of Table XIV into consideration, one might
generalize by saying that the subject fields, ranked from highest to
lowest on the basis of the chances a student has to succeed in them,
take the following order: Mathematics, English, Foreign Language,
Science, and Social Science.

The practical value of these findings is somewhat obscure. The last
column of Table XIV shows that in only four of the equations are
predictions entirely reliable in more than sixty per cent of the cases.
There is a considerable probability that a difference exists between
the subject pairs in the directions indicated, in about five-sixths, or
eighty-three per cent of the cases on average, if the equations in
which high school marks occur are omitted. This may possibly be of some
value in advising with students about these subjects when they are
prerequisite to certain curricula that are basic to given vocations.

TABLE XIV

PREDICTED DIFFERENCES OF COLLEGE FRESHMAN MARKS IN THE SUBJECT FIELDS,
AND PERCENTAGE OF STUDENTS VARYING AS MUCH AS TWO OR FOUR PROBABLE
ERRORS OF ESTIMATE FROM THE AVERAGE IN THE TWENTY REGRESSION SITUATIONS

Regression    Av. Predicted    Per Cent of Pupils       Per Cent of Pupils
Situation     Diff.            Two PE* from Average     Four PE† from Average

*These are eighty-two per cent reliable.
†These are entirely reliable.


V. SUMMARY


The purpose of this study was to determine the efficiency of certain
measures and combinations of measures in the prediction of college
freshman marks, no attempt being made to place an absolute evaluation
upon the determined efficiency, or to increase it. Five variables were
used as general predictive agents. They are the Ohio State University
Psychological Examination, Form 18; the American Council Psychological
Test; high school average marks; Purdue Placement Test in English; and
first semester college freshman marks. In addition to these general
measures, separate marks in the subject fields of English, mathematics,
science, foreign language, and social science, on both high school and
college levels, were included.

Intercorrelations among these measures were calculated, furnishing a
basis for certain direct deductions, as well as data for multiple
correlations and regression equations, and supplying materials for
differential predictions among the subject fields.

From a larger number of regression equations developed, ten were used
for the prediction of college freshman marks. The predicted marks were
analyzed (a) by comparing the averages and sigmas of predicted marks
with the averages and sigmas of corresponding achieved marks; (b) by
noting the difference between predicted and achieved marks for each
student, and finding the average deviation and range of difference
between the two; (c) by throwing the one hundred forty students of the
study into a decile distribution for both predicted and achieved marks,
each student being identified both individually and by sex; and (d) by
comparing predicted and achieved marks with reference to a critical
point, and noting the efficiency of prediction with reference to that
point.

VI. CONCLUSIONS


1. There is little to choose between the predictive efficiency of high
school average marks and Ohio State University Psychological
Examination scores, when no distinction of sexes is made. High school
marks predict college marks for boys better than for girls; Ohio State
University Psychological Examination scores predict marks better for
boys in the lower part of the curve, while the reverse is true in the
upper part. Prediction by the American Council Test is inferior to that
of the high school average or Ohio State University Psychological
Examination; but the order of efficiency as between the sexes and as
between the upper and lower parts of the curve, is the same as that of
the Ohio State University Psychological Examination.

2. Analysis of correlations between specific 
high school subjects or subject fields and cor- 
responding college subjects or subject fields 
yields little evidence to support the traditional 
practice of demanding prerequisites or credit- 
patterns in high school as essential to success 
In college. 

3. Relatively greater accuracy in the upper 
and lower deciles of a ranked distribution, 
sensed but not explained by Whipple (15, p. 
262ff), permits the practical use of smaller 
correlations for prediction than traditional 
thought has sanctioned. 

4. Given a multiple correlation and a zero-order
correlation approximately equal in size
(with no statistically significant difference
between the two), this study offers evidence that
the multiple correlation is the more reliable
for purposes of prediction.

5. There is evidence that none of the general
measures used in this study has equal
predictive efficiency for both sexes, the men- 
tal tests relatively over-rating the boys and 
high school marks relatively over-rating the 
girls. When combined, high school marks 
and mental tests tend to counteract each other 
and to yield more uniform results with the 
sexes. 

6. It is possible, and perhaps desirable, to 
determine the “critical point” in the marking 
system of a given college or university, that is, 
the point below which a freshman may not 
fall and have as much as fifty chances in one 
hundred for graduation. The mark in the 
distribution one probable error below the 
“critical point” is the minimum mark a stu- 
dent may receive and retain any hope for 
graduation. This may be called the “fatal 
point” in the marking system of the school.
Also, the mark one probable error above the 
“critical point” is the minimum mark a stu- 
dent may receive and retain any fear of fail- 
ure in the institution, according to the find- 
ings of this study. This point may be termed 
the “safety point” in the marking system of 
the school. 

7. In differential prediction of subject field 
marks, reliability of the predicting agent is of 
paramount importance. This is demonstrated 
and emphasized by the fact that mental tests, 
with relatively high coefficients of reliability, 
are, in spite of their generalized nature and 
relatively low correlations with subject differ- 
ences, more efficient in prediction than are 
high school marks with relatively high corre- 
lations with differences in subject field marks, 
but with low coefficients of reliability, accord- 
ing to the technique of differential prediction 
used in this study. 


VII. PROBABLE VALUE OF THE FINDINGS 


Although absolute evaluation of the meas- 
ures used is not a major objective in this 
study, a word might be said about the prob- 
able practical use of the findings. 

(a) The widely varying degrees of predictive efficiency as between the
sexes, noted in each of the regressions used, should be of some service
in warning administrators and personnel workers against the practice of
using the same predictive agent for the whole student body, upon which
grave decisions are made that are vital to the future welfare of the
student.

(b) The decile technique used in this study to analyze the results of
prediction, both as between the sexes and the abler and weaker
students, or perhaps a similar device such as a quintile distribution
based upon a five-point grading system, may possibly be of service in
enabling the research worker in this field to make the results of his
investigations more concrete and practical.

(c) Demonstration of the superiority of 
multiple correlations over that of zero-order 
correlations in the assignment of more uni- 
form rankings to a given student should serve 
as a warning against the practice of sectioning 
or pigeon-holing students on the results of 
predictions from a single variable. 

(d) The differences in predictive efficiency 
among the general predictive agents used in 
this study, while not entirely significant sta- 
tistically, may possibly be of some value to 
administrators and personnel workers in selecting
materials for the evaluation and guidance
of their students.

(e) Demonstration of the relative efficiency 
of comparatively low correlations in the pre- 
diction of the ranks of students in the upper 
and lower deciles of the group may tend 
towards a readjustment in our thinking as to 
the minimum correlation that may be of serv- 
ice in college administration. If it will be of 
service to administrators and personnel work- 
ers to know at the time of enrollment, or soon 
thereafter, the approximate rank that seven to 
eight-tenths of the students will attain in their 
college work, then this will be so. 


(f) The fact that the Ohio State Univer- 
sity Psychological Examination rates mathe- 
matics and foreign language far above the 
other subject fields of English, science and 
social science, and that there is a wide differ- 
ence in the relative rating of foreign language 
by the Ohio State University Psychological 
Examination and the American Council Test, 
as revealed by the differential prediction tech- 
nique used in this study, suggest the possibil- 
ity of using this technique with profit for the 
validation of tests. 


VIII. QUESTIONS FOR FURTHER STUDY


1. Why do girls exceed and boys fall short
of predictions from mental tests?

2. Why do boys exceed and girls fall short of
predictions from high school marks?

3. How can the reliability of college marks be
raised?

4. How can character traits be scientifically
included in prediction teams?

5. Why are a student’s rankings predicted by
regressions based upon multiple correla-
tions more uniform than are his rankings
predicted by regressions based upon zero-
order correlations of about the same
magnitude?


THE RELATIONSHIP BETWEEN THE TYPE OF QUESTION
AND SCORING ERRORS


JACK W. DUNLAP
Fordham University


A point often advanced in favor of objective
tests is their freedom from error in scoring.
Nevertheless, that scores on objective tests are
frequently in error has been shown by Pintner,
Dearborn and Smith, and Herbst. Several
techniques have been proposed to overcome
scoring errors: notably the self-scoring tests of
Clapp and Young; printing the correct response
on the answer sheet after the student has
recorded his choice (proposed by Toops);
mimeographing the correct answers on the
answer sheet (a modification of Toops' method
proposed by Cuff); the Testometer, a mechanical
device for determining the score (designed by
Cuff); and such automatic scoring devices as the
Perfo-scorer, the Thermo-scorer, and the
Chemo-scorer (developed by Peterson and
Peterson).

Some of the causes for errors in scoring are:
carelessness in marking the response, mistakes
in counting the number of items marked right,
wrong, or omitted, errors in arithmetic, errors
in transmuting scores, and errors in
transferring scores. It has been suspected for a
long time that an important cause of error in
scoring is the form in which the questions are
presented, but so far as the writer knows,
little attention has been given to the problem,
at least in the literature.

This paper is concerned with the factor of type
or form in which the question is presented and
its relation to scoring errors. The basic
question is: are certain types of test questions
more subject to scoring errors than are others?
By type is meant the mechanical form of
presenting the question and recording the
response. Another question on which these data
may throw some light is: do scorers tend to
underscore or overscore a test, that is, tend to
give insufficient credit or to give undue
credit?

Three hundred ninety-eight Terman Group Tests
of Mental Ability scored and rescored for
another purpose furnish the basic data for this
study*. The Terman Group Test of Mental Ability
has ten subtests. This set of papers was
analyzed to determine the number of papers
having scoring errors for each subtest, and to
determine the total number of such errors
occurring in each subtest. The papers were
scored by thirty teachers under supervision.
Three or four, and occasionally more, teachers
scored each subtest. When a group was
particularly slow in scoring a subtest,
unoccupied scorers were asked to assist.
Slowness in scoring a particular test may have
been due to any one or a combination of causes,
such as length of the test, inherent difficulty
in scoring that test, or to the slowness of that
particular group of scorers. The data secured by
this routine of scoring are not ideal for this
purpose, since it may be contended that if a
given teacher is prone to one type of error, the
subtest she graded would show a disproportionate
number of errors of that type. The ideal
situation would have been for each teacher to
have scored the same number of blanks for all
subtests. It may be assumed, however, since
three, four, or more teachers scored each
subtest, that such constant errors tend to
cancel out from one test to another. Even if
this be false, it is worthwhile to examine the
data as to the relationship between errors and
types of questions.

*The writer is indebted to Mr. A. Kroll, of Benjamin
Franklin High School, New York City, for making these
data available.

An examination of Table I reveals that the
number of papers having errors varies from 11 in
Test Nine to 116 in Test Three. The number of
subtests originally overscored is 236, while the
number underscored is 298. Thus, out of 3980
subtests, 534 or 13% were in error. Since the
number of items varies from test to test, the
likelihood of an error of scoring occurring,
other things being equal, will be greatest in
the longer tests. For purposes of comparison,
therefore, the tests have been equated to a
common base of twenty items. The data after
equating are shown in the bottom part of Table
I. The number of subtests having scoring errors
after adjustment of the tests to a common length
is 555. Thus, out of 3980 subtests, fourteen per
cent are in error. The number of test papers
having errors varies from 12 in Test Nine to 103
in Test Eight.


TABLE I
THE NUMBER OF PAPERS IN ERROR BY SUBTEST WHEN RESCORED
A PLUS INDICATES PAPERS ORIGINALLY OVERSCORED; A MINUS, PAPERS UNDERSCORED.
N = 398

                                          Subtests                            Total Subtests
                        1     2     3     4     5     6     7     8     9    10    In Error
+                      18    30    50    31     7    17    27    41     4    11      236
-                       9     8    66    46    15    48    40    51     7     8      298
T                      27    38   116    77    22    65    67    92    11    19      534
Rank                    4     5    10     8     3     6     7     9     1     2
Items in test          20    11    30    20    12    24    20    18    18    12

Number of Papers in Error if All Tests Had 20 Items

+                      18    55    33    31    12    14    27    46     4    18      268
-                       9    15    44    46    25    40    40    57     8    13      287
T                      27    70    77    77    37    54    67   103    12    31      555
% Papers in error     6.8  17.6  19.3  19.3   9.3  13.6  16.8  25.9   3.0   7.8
Rank                    2     7   8.5   8.5     4     5     6    10     1     3


The first question to arise is: can the
variation between the various subtests be
ascribed to chance? The simplest method for
testing this is the method of chi square*. Since
the total number of subtests in error is 555, and
these fall into ten classes, the best estimate
of the number to be expected if only chance is
operating to cause errors in scoring is the
average, namely, 55.5. There are nine degrees of
freedom here, and the value of chi square is
128.95. The one per cent point with nine degrees
of freedom is 21.666, so that there can be no
question that the difference is due to some
factor other than chance.

*The method of chi square is old, and although it is an
extremely important method, it has not been widely used.
For the convenience of the reader who has forgotten or is
unfamiliar with the method, the following references are given:
Garrett, H. E. Statistics in Psychology and Education,
Longmans Green and Co., 1937, Pp. 119-124.
Guilford, J. P. Psychometric Methods, McGraw Hill Book
Co., 1936, Pp. 92-93, 180-181.
Holzinger, K. J. Statistical Methods for Students in
Education, 1928, Pp. 245-248.

The total number of scoring errors is shown by
subtest in Table II. Again, for the purposes of
comparison, the tests have been equated to a
common length of twenty items. The total number
of errors ranges from 15 for Test Nine to 254 for
Test Three. The last two lines of Table II show
the average and median number of errors
occurring by subtests for those blanks having
errors. The averages range from 1.4 per paper in
Test Nine to 6.3 in Test Two. Test Two is a
best-answer test, where one phrase in a series
of three is marked. It is unbelievable that 6.3
items out of twenty, or 31.5 per cent, were
misscored. The median, however, is 2.0 or 10 per
cent. This extremely high average may be due to
error of a single scorer, but in opposition to
this hypothesis is the fact that other tests of
a similar form, namely Test Five and Test Ten,
also show undue error in scoring.
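
The equating of the subtests to a common base of twenty items, as described above, may be illustrated by a short sketch. It scales each count by 20 divided by the number of items in the subtest and rounds to the nearest whole paper. The counts are those read from Table I; the item count for Test One (20) and the minus counts for Tests One, Two and Ten (9, 8 and 8) are not fully legible in the table and are inferred here from the row totals, so they should be treated as assumptions. The function names are illustrative only.

```python
# Papers in error per subtest (Tests One..Ten) as read from Table I.
# ASSUMPTION: Test One's item count (20) and the minus counts for
# Tests One, Two and Ten (9, 8, 8) are inferred from the row totals.
ITEMS = [20, 11, 30, 20, 12, 24, 20, 18, 18, 12]
PLUS = [18, 30, 50, 31, 7, 17, 27, 41, 4, 11]    # originally overscored
MINUS = [9, 8, 66, 46, 15, 48, 40, 51, 7, 8]     # originally underscored
BASE = 20  # common test length used for comparison


def equate(count, n_items, base=BASE):
    """Scale a count to the value expected for a test of `base` items."""
    return count * base / n_items


def round_half_up(x):
    """Round a non-negative value to the nearest whole number."""
    return int(x + 0.5)


adjusted_plus = [round_half_up(equate(c, n)) for c, n in zip(PLUS, ITEMS)]
adjusted_minus = [round_half_up(equate(c, n)) for c, n in zip(MINUS, ITEMS)]
adjusted_total = [p + m for p, m in zip(adjusted_plus, adjusted_minus)]

print(adjusted_total)       # adjusted papers in error, Tests One..Ten
print(sum(adjusted_total))  # adjusted number of subtests in error
```

Under these assumptions the adjusted totals agree with the T row in the lower part of Table I and sum to the 555 subtests cited in the text.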

That the differences in number of errors
from subtest to subtest are not ascribable to
chance is shown by the chi square value of
480.5. Again with nine degrees of freedom,
the value of chi square for the one per cent
level is 21.666; so there can be no question
that some factor other than chance is
operating.
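
The chi square test applied to the papers in error may be sketched as follows. The sketch assumes the adjusted numbers of papers in error per subtest given in Table I and takes the expected number in each of the ten classes to be the average, 55.5, as the text describes. With these rounded counts the statistic comes out near 129, close to though not identical with the reported 128.95; the small difference presumably arises from rounding in the adjusted counts. The function name is illustrative only.

```python
# Adjusted papers in error for Tests One..Ten (T row, lower part of Table I).
observed = [27, 70, 77, 77, 37, 54, 67, 103, 12, 31]

# One per cent point of chi square with nine degrees of freedom,
# as quoted in the text.
CRITICAL_1PCT_9DF = 21.666


def chi_square_uniform(counts):
    """Chi square against equal expected counts in every class.

    Returns the statistic and the degrees of freedom (classes - 1).
    """
    expected = sum(counts) / len(counts)
    statistic = sum((o - expected) ** 2 / expected for o in counts)
    return statistic, len(counts) - 1


chi2, dof = chi_square_uniform(observed)
print(round(chi2, 2), dof)
print(chi2 > CRITICAL_1PCT_9DF)  # True means chance alone is rejected
```

The same function could be applied to the total errors by subtest in Table II, for which the text reports a chi square of 480.5.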

TABLE II
THE TOTAL AMOUNT OF ERROR BY SUBTEST, THE TOTAL ERROR BY SUBTEST WHEN THE TESTS
ARE EQUATED TO TWENTY ITEMS, AND THE AVERAGE ERROR FOR PAPERS HAVING ERRORS

[Table body not legible in the source.]


TABLE III
THE TESTS CLASSIFIED AS TO TYPE AND RANKED ACCORDING TO (a) NUMBER OF TESTS HAVING
ERRORS AND (b) NUMBER OF ERRORS PER TEST FOR UNADJUSTED AND ADJUSTED DATA

                                                      Papers in Error
                                                  Raw Data   Corrected Data
Type of Test                              Test      Rank          Rank
1. Two Choice, Indicating a Single          3        10            8.5
   Response on the Margin                   6         6            5
                                            8         9           10
2. Multiple Choice, Indicating a Single     1         4            2
   Word in the Text                         7         7            6
                                            9         1            1
3. Multiple Choice, Indicating Two          4         8            8.5
   Words in the Text
4. Writing Correct Response on Margin       2         5            7
                                            5         3            4
                                           10         2            3

[The mean-rank columns and the ranks for errors per test are not legible in the source.]


In Table III, the subtests are classified as to
type of question. Tests Three, Six, and Eight
are two-choice tests, respectively yes-no,
same-opposite, and true-false. Tests One, Seven
and Nine are multiple choice tests where the
student underlines a single word in the text.
Test Four is also a multiple choice test, but
the subject must underline two words in the
text. In the last three tests, Two, Five, and
Ten, the correct response is written on the
margin of the test. Under the heading “papers in
error” is given the rank of the test in terms of
the raw data and the rank after the tests have
been adjusted to a common length. The
correlation between these two ranks is .91. When
these tests are adjusted to a common length, it
is readily seen that the multiple choice tests
where a single word in the text is indicated as
the answer show the fewest errors, closely
followed by the tests where a single correct
response is written in the margin.
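
The correlation of .91 between the raw and the adjusted ranks may be checked with the rank-difference formula. The sketch below assumes that formula was the one used, which the text does not state, and it ignores the small correction for the one tie. The ranks are those of Table I; the adjusted ranks for Tests Six to Ten are derived from the adjusted counts and are therefore an assumption.

```python
# Ranks of Tests One..Ten by number of papers in error (Table I).
raw_rank = [4, 5, 10, 8, 3, 6, 7, 9, 1, 2]
# ASSUMPTION: adjusted ranks for Tests Six..Ten are derived from the
# adjusted counts (54, 67, 103, 12, 31), not read from the table.
adjusted_rank = [2, 7, 8.5, 8.5, 4, 5, 6, 10, 1, 3]


def spearman_rho(x, y):
    """Rank-difference correlation: 1 - 6 * sum(d^2) / (n * (n^2 - 1))."""
    n = len(x)
    d_squared = sum((a - b) ** 2 for a, b in zip(x, y))
    return 1 - 6 * d_squared / (n * (n * n - 1))


rho = spearman_rho(raw_rank, adjusted_rank)
print(round(rho, 2))
```

The sum of squared rank differences is 15.5, which under these assumptions gives a coefficient that rounds to the value cited in the text.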

The most difficult test to score is the one in
which a choice of two answers is indicated on
the margin, as same-opposite, yes-no, or
true-false. Equally difficult to score is the
test where two words must be underlined in the
text. The mean number of papers in error by type
of test is 35 for type two (multiple choice,
indicating a single word in the text), 46 for
type four (writing correct response in margin),
78 for type one (two choice, indicating a single
response on the margin), and 77 for type three
(multiple choice, indicating two words in the
text).
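
The means by type of test quoted above follow from the adjusted numbers of papers in error in Table I, grouped by the classification of Table III. A minimal sketch, assuming those adjusted counts and using illustrative labels for the four types:

```python
# Adjusted papers in error per subtest (T row, lower part of Table I).
adjusted_papers = {1: 27, 2: 70, 3: 77, 4: 77, 5: 37,
                   6: 54, 7: 67, 8: 103, 9: 12, 10: 31}

# Classification of the subtests by type of question (Table III).
# The dictionary keys are illustrative labels, not the table's wording.
test_types = {
    "type one (two choice, response on margin)": [3, 6, 8],
    "type two (multiple choice, one word in text)": [1, 7, 9],
    "type three (multiple choice, two words in text)": [4],
    "type four (writing correct response on margin)": [2, 5, 10],
}

mean_by_type = {
    name: sum(adjusted_papers[t] for t in tests) / len(tests)
    for name, tests in test_types.items()
}

for name, mean in mean_by_type.items():
    print(name, round(mean))
```

Rounded to whole papers, the four means are 78, 35, 77 and 46, in agreement with the figures given in the text.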

The samples of types of tests
are so small that further statistical refinement
is not necessary, but the evidence seems be- 
yond cavil that tests 3, 4, 6, and 8 represent 
types that are more difficult to score than 
are the other types studied. 

The question arises, do scorers tend to
underscore or overscore? Since the number of
errors in the adjusted series is 1286, half of
these should be positive and half should be
negative if chance alone determined the
distribution. It cannot, however, be assumed
that these are spread equally throughout the ten
tests, which would give twenty categories and
nineteen degrees of freedom, since it has been
shown that chance alone does not operate in the
distribution from subtest to subtest. It is,
however, legitimate to assume that within a test
the errors should be equally distributed. The
total number of positive errors is 567 and the
number of negative errors 719. Computing chi
square on this assumption gives a value of
102.12, which is far in excess of the one per
cent level with nineteen degrees of freedom,
namely 36.191. There is a decided tendency for
this group to underscore rather than to
overscore.
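
The assumption that positive and negative errors should be equally distributed within a test can be put in a short sketch. The function below takes the positive and negative error counts for each test and sums the chi square contributions when the expected count in each category is half the total for that test. The per-test error counts of Table II are not available here, so the reported value of 102.12 is not reproduced; as a simplified check, which is not the computation made in the text, the function is applied to the pooled totals of 567 and 719 treated as a single test. The function name and the one per cent point for one degree of freedom (6.635) are supplied for the illustration.

```python
def within_test_chi_square(plus_counts, minus_counts):
    """Chi square when errors within each test are expected to split evenly.

    For each test the expected count is (plus + minus) / 2 in both
    categories, and the contributions are summed over the tests.
    """
    statistic = 0.0
    for plus, minus in zip(plus_counts, minus_counts):
        expected = (plus + minus) / 2
        statistic += (plus - expected) ** 2 / expected
        statistic += (minus - expected) ** 2 / expected
    return statistic


# Pooled totals from the text, treated as one test (a simplification).
pooled = within_test_chi_square([567], [719])

# One per cent point of chi square with one degree of freedom.
CRITICAL_1PCT_1DF = 6.635

print(round(pooled, 2))
print(pooled > CRITICAL_1PCT_1DF)  # True: negative errors predominate
```

Even on the pooled totals alone, the excess of negative over positive errors exceeds what the one per cent level allows, which agrees in direction with the conclusion drawn in the text.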


Since these data are based on an intelligence
test, it seems appropriate to determine the
number of students whose I. Q. was incorrectly
determined. Table IV shows the number of
students whose I. Q. was under- or
over-estimated, together with the size of the
error. The two hundred twenty-five cases that
had errors in the I. Q. are due almost wholly to
errors in scoring. Only ten of the errors, 2.5%
of all cases, were due to mistakes in
determining the chronological age, which was
computed by means of a table. It should be noted
that there are 47 cases, 11.8% of all cases
considered, where the I. Q. is in error by nine
points or more. Thus, one child in eight,
approximately, is misplaced by nine or more
points of I. Q. There seems to be no question
that tests on which I. Q.’s are to be determined
should be rescored.
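
The percentages in the preceding paragraph may be checked by simple arithmetic. The sketch below assumes that each percentage is taken on the whole group of 398 papers, which is the base that reproduces the quoted figures; the variable names are illustrative only.

```python
N_PAPERS = 398          # papers scored and rescored
IQ_IN_ERROR = 225       # cases with any error in the I. Q.
AGE_MISTAKES = 10       # errors traced to the chronological age
NINE_OR_MORE = 47       # cases in error by nine I. Q. points or more


def per_cent(part, whole=N_PAPERS):
    """Percentage of `whole` represented by `part`, to one decimal."""
    return round(100 * part / whole, 1)


print(per_cent(AGE_MISTAKES))    # share of all papers with an age mistake
print(per_cent(NINE_OR_MORE))    # share misplaced by nine points or more
print(per_cent(IQ_IN_ERROR))     # share with any error in the I. Q.
print(round(N_PAPERS / NINE_OR_MORE, 1))  # "one child in ..." figure
```

On this base the ten age mistakes come to 2.5 per cent and the 47 large errors to 11.8 per cent, or about one child in eight and a half.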


TABLE IV
THE NUMBER OF I.Q.’s THAT WERE UNDER- AND OVER-ESTIMATED TOGETHER WITH THE
MAGNITUDE OF THE ERRORS

[Table body not legible in the source.]


In conclusion, it appears that some types of
questions are more likely to be misscored than
others. This seems particularly true of the
true-false, yes-no, and same-opposite tests
where the subject is required to underline one
of the terms, and for tests where the subject
underlines two words in the text. If this is not
an artifact of the particular group of scorers
used, and if the use of items in this form is
not to be discontinued, either the tests must be
rescored or some mechanical method of scoring
must be utilized.